How to find love or adventure with the help of crate.io and Kibana
You can argue about the effectiveness, quality and efficiency of dating sites, and you can find 101 reasons why it is better to look for acquaintances at a club, a bar or a park (add your own). But what made people laugh ten or fifteen years ago is now mainstream. Why not try one more way to search and chat on the Internet, and then move on to meeting in real life?

Article based on information from habrahabr.ru

Here is a geek's take on search technology; a screencast of the application is below the cut. At the end of the article there is a link to an archive with the application, released under the Apache License v2.0, and a small sample data set.
Sounds tempting, doesn't it? The reality is somewhat more complicated: armies of bots and fake accounts, workers of the oldest profession, dating services that try to earn as much as possible while delivering as little as possible, and even thieves looking for prey. Getting more interesting? It is not all that bad, and with the right approach it is worth it!
The promised screencast of the application:
Let's look at the software side of the search. As in the "how to draw an owl" joke, the task splits into two parts:
- The first part is to draw the ovals. For us this means finding, collecting and organizing data for later search. Any programming language with an HTTP client library and regular expressions or DOM/XPath will do. For me this part was not a problem: I am a developer with a solid background in IT systems integration, and I developed a distributed crawler for the search startup Visuvi. If you find this topic interesting, say so in the vote for a new article topic.
- The second part is to draw the rest of the owl: store the data in a repository, index it, and write a front end for searching and viewing it.
crate.io comes to the rescue: it is a set of plugins for storing binary data in the file system and for executing distributed SQL queries using capabilities that the Elasticsearch search server already has. In a nutshell, it is a shared-nothing NoSQL database based on Elasticsearch, with the Facebook Presto SQL parser and planner layered on top. It is a distributed solution from the big data world, which for now we will run as a single process on a single computer.
Why crate.io? We need somewhere to store photos, we need Elasticsearch, and SQL may come in handy later for statistics and reports. Relax: this time we will do without enterprise, Hibernate and JPA :). As you will see, working with crate is no harder than working with a relational database.
Kibana is an HTML5 application that visualizes data from Elasticsearch, works with time series, filters the data and saves search parameters as dashboards.
How can this help with the search? Minimum programming, maximum results.
There are crate.io clients for Python, Ruby and PHP, and for Java there is a JDBC type 4 driver. But I prefer to enable the Elasticsearch REST API, which crate hides for some reason, and to work through it.
In the file config/crate.yml add the parameters:
es.api.enabled: true
udc.enabled: false
The second parameter disables the usage reports (UDC) that crate.io sends to its own server. I also deleted the binary files of the sigar monitoring library right away, so as not to upset your antivirus.
Configured this way, crate becomes friendly to work with over the Elasticsearch REST API, for example using Spring Data Elasticsearch.
Starting the server requires a Java JRE, version 7 or later.
Start the project with bin/crate (on Windows, use bin\crate.bat).
Using the crash command-line utility or the web console at
http://localhost:4200/_plugin/crate-admin/#/console
create a blob table named images to store the photos:
bin/crash -c "create blob table images clustered into 7 shards with (number_of_replicas=0)"
+-----------------------+-----------+---------+-----------+---------+
| server_url            | node_name | version | connected | message |
+-----------------------+-----------+---------+-----------+---------+
| http://127.0.0.1:4200 | Brigade   | 0.45.3  | TRUE      | OK      |
+-----------------------+-----------+---------+-----------+---------+
CONNECT OK
CREATE OK (1.104 sec)
Elasticsearch does not require us to define the data format up front. With that approach the devil is in the details, but that is more a topic for the comments to this article. I will specify the data types explicitly with the Mapping API to avoid problems with search and with display in Kibana.
Data types
{
"info": {
"mappings": {
"default": {
"properties": {
"accommodation": {
"type": "string",
"index": "not_analyzed"
},
"age": {
"type": "long"
},
"build": {
"type": "string",
"index": "not_analyzed"
},
"drinkingHabits": {
"type": "string",
"index": "not_analyzed"
},
"education": {
"type": "string",
"index": "not_analyzed"
},
"ethnicity": {
"type": "string",
"index": "not_analyzed"
},
"first": {
"type": "date",
"format": "basic_date_time"
},
"height": {
"type": "long"
},
"images": {
"type": "string"
},
"info": {
"properties": {
"": {
"type": "string"
},
"Weight": {
"type": "string"
},
"Appearance": {
"type": "string"
},
"Children": {
"type": "string"
},
"Languages": {
"type": "string"
},
"I want to find": {
"type": "string"
},
"Material provision": {
"type": "string"
},
"Education": {
"type": "string"
},
"Orientation": {
"type": "string"
},
"Drinking habits": {
"type": "string"
},
"Smoker": {
"type": "string"
},
"Relations": {
"type": "string"
},
"Get acquainted": {
"type": "string"
},
"Accommodation": {
"type": "string"
},
"Height": {
"type": "string"
},
"Body": {
"type": "string"
}
}
},
"kids": {
"type": "string",
"index": "not_analyzed"
},
"last": {
"type": "date",
"format": "basic_date_time"
},
"login": {
"type": "string"
},
"mainImage": {
"type": "string",
"index": "not_analyzed"
},
"message": {
"type": "string"
},
"readableLogin": {
"type": "boolean"
},
"realName": {
"type": "string"
},
"relationship": {
"type": "string",
"index": "not_analyzed"
},
"replyRate": {
"type": "long"
},
"searchingFor": {
"type": "string"
},
"self": {
"properties": {
"The friends I most value": {
"type": "string"
},
"In women, I especially value": {
"type": "string"
},
"In my life I set a goal": {
"type": "string"
},
"I especially appreciate": {
"type": "string"
},
"I have Pets": {
"type": "string"
},
"Of all the famous people I would like to be": {
"type": "string"
},
"How long will I be able to live without communication": {
"type": "string"
},
"The place where I would like to live in": {
"type": "string"
},
"My favorite dish": {
"type": "string"
},
"My education": {
"type": "string"
},
"My free time, I would like to spend": {
"type": "string"
},
"My favorite literary characters": {
"type": "string"
},
"My favorite musical performers": {
"type": "string"
},
"My favorite writers": {
"type": "string"
},
"My favorite movies": {
"type": "string"
},
"My favorite artists": {
"type": "string"
},
"My motto": {
"type": "string"
},
"My favorite city": {
"type": "string"
},
"The greatest happiness for me": {
"type": "string"
},
"The most startling discovery for me": {
"type": "string"
},
"The most attractive aspect of his character I believe": {
"type": "string"
},
"The most valuable advice I have received in life": {
"type": "string"
},
"I'd like to have children": {
"type": "string"
},
"I'm most proud of this achievement": {
"type": "string"
},
"My dream job": {
"type": "string"
}
}
},
"smoker": {
"type": "string",
"index": "not_analyzed"
},
"updated": {
"type": "date",
"format": "basic_date_time"
},
"viewed": {
"type": "long"
},
"weight": {
"type": "long"
}
}
}
}
}
}
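A mapping this long is tedious to type by hand. As a rough sketch (not the code used for this article), the top-level part could be generated from lists of field names and then sent with the Elasticsearch 1.x Put Mapping API, for example as PUT /info/_mapping/default. The helper name buildMapping and the grouping of fields are illustrative assumptions, and only a few fields from the mapping above are included.

```javascript
// Hypothetical helper: builds the body for PUT /info/_mapping/default
// from lists of field names. Field types follow the mapping shown above.
function buildMapping(typeName, fields) {
  const properties = {};
  // Exact-value strings: not analyzed, so they are indexed as whole values.
  for (const name of fields.keywords || []) {
    properties[name] = { type: "string", index: "not_analyzed" };
  }
  for (const name of fields.numbers || []) {
    properties[name] = { type: "long" };
  }
  for (const name of fields.dates || []) {
    properties[name] = { type: "date", format: "basic_date_time" };
  }
  // Free text: analyzed with the default analyzer.
  for (const name of fields.texts || []) {
    properties[name] = { type: "string" };
  }
  return { [typeName]: { properties } };
}

const mapping = buildMapping("default", {
  keywords: ["accommodation", "build", "smoker", "mainImage"],
  numbers: ["age", "height", "weight"],
  dates: ["first", "last", "updated"],
  texts: ["login", "message"],
});
console.log(JSON.stringify(mapping, null, 2));
```

Keeping the exact-value fields as not_analyzed is what lets filters and terms panels treat a value such as a photo digest as one token instead of splitting it.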
Run the script that downloads HTML pages from the sites, parses the HTML, extracts the needed data and stores it using the REST API or the Elasticsearch Java client.
Be sure to load the JSON with the index type "default", so that SQL queries can be run against it.
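The storing step can be sketched roughly as follows. This is not the crawler from the article: the helper names, the use of login as the document id and the sample values are assumptions made for illustration. The sketch only builds the request description for the Elasticsearch index API (PUT /info/default/{id}) and formats dates in the basic_date_time format that the mapping declares.

```javascript
// Formats a Date as Elasticsearch "basic_date_time"
// (yyyyMMdd'T'HHmmss.SSSZ) in UTC.
function toBasicDateTime(date) {
  // toISOString() gives e.g. 2014-11-27T10:15:00.000Z; drop "-" and ":".
  return date.toISOString().replace(/[-:]/g, "");
}

// Hypothetical helper: describes the request that stores one profile.
// The type is "default" so that crate can see the document from SQL.
function buildIndexRequest(baseUrl, profile) {
  return {
    method: "PUT",
    // Assumption: the login is used as the document id.
    url: baseUrl + "/info/default/" + encodeURIComponent(profile.login),
    body: JSON.stringify({
      login: profile.login,
      age: profile.age,
      first: toBasicDateTime(profile.first),
    }),
  };
}

const request = buildIndexRequest("http://127.0.0.1:4200", {
  login: "example_user", // made-up sample values
  age: 25,
  first: new Date(Date.UTC(2014, 10, 27, 10, 15, 0)),
});
console.log(request.method, request.url);
console.log(request.body);
```

The request object can then be handed to whatever HTTP client the crawler already uses.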

An example of one of the JSON documents.

cr> select count(*) from info;
+----------+
| count(*) |
+----------+
| 291      |
+----------+
SELECT 1 row in set (0.030 sec)
What is the average age in the sample data?
cr> select avg(age) from info;
+---------------+
| avg(age)      |
+---------------+
| 24.7275862069 |
+---------------+
SELECT 1 row in set (0.038 sec)
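The same queries can also be sent without crash, through crate's HTTP SQL endpoint (POST /_sql with a JSON body that contains the statement). Below is a minimal sketch under that assumption; the helper names are hypothetical and the sample response is written by hand to mirror the crash output above, not captured from a server.

```javascript
// Builds the request for crate's HTTP SQL endpoint (POST /_sql).
function buildSqlRequest(baseUrl, stmt, args) {
  return {
    method: "POST",
    url: baseUrl + "/_sql",
    body: JSON.stringify({ stmt: stmt, args: args || [] }),
  };
}

// Turns a {cols, rows} response into an array of plain objects.
function rowsToObjects(response) {
  return response.rows.map((row) => {
    const record = {};
    response.cols.forEach((col, i) => {
      record[col] = row[i];
    });
    return record;
  });
}

const req = buildSqlRequest("http://127.0.0.1:4200", "select avg(age) from info");
console.log(req.url, req.body);

// Hand-written sample response that mirrors the crash output above.
const sample = { cols: ["avg(age)"], rows: [[24.7275862069]] };
console.log(rowsToObjects(sample));
```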
The script downloads each image, computes its SHA-1 digest and makes an HTTP PUT request to crate.io for each photo:
"http://127.0.0.1:4200/_blobs/images/"+fileDigest
You can check that the entries are now in blob.images:
cr> select count(*) from blob.images;
+----------+
| count(*) |
+----------+
| 2813     |
+----------+
SELECT 1 row in set (0.029 sec)
Good, the data is in the database!
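The image upload step described above can be sketched like this. It is an illustration rather than the script from the article: the helper names are hypothetical, and the bytes are a stand-in for real image content. The URL layout follows the "/_blobs/images/"+fileDigest expression shown earlier.

```javascript
const crypto = require("crypto");

// Computes the hex SHA-1 digest that becomes the last part of the blob URL.
function sha1Hex(bytes) {
  return crypto.createHash("sha1").update(bytes).digest("hex");
}

// Hypothetical helper: describes the PUT request for one downloaded image.
function buildBlobPut(baseUrl, table, bytes) {
  const fileDigest = sha1Hex(bytes);
  return {
    method: "PUT",
    url: baseUrl + "/_blobs/" + table + "/" + fileDigest,
    digest: fileDigest,
    body: bytes,
  };
}

// Stand-in bytes; a real run would pass the downloaded image content.
const put = buildBlobPut("http://127.0.0.1:4200", "images", Buffer.from("abc"));
console.log(put.url);
```

Storing the returned digest in the mainImage and images fields of the profile document is what lets the front end later build /_blobs/images/{digest} links to the photos.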
Download the Kibana archive and unpack it into the directory plugins/kibana/_site. After you restart the server, it will pick up the front end as a site plugin.
In plugins/kibana/_site/config.js set the address of the Elasticsearch REST API:
elasticsearch: "http://"+window.location.host
All the changes to Kibana are minor, more hacks than anything. The right way would be to write a configurable component of our own.
This piece of the AngularJS template shows a rating selector in the table for the _id field and a photo for the visible mainImage field.
plugins/kibana/_site/app/panels/table/module.html
Code that displays the photo in the table and the rating vote
<tr ng-click="toggle_details(event)" class="pointer">
<td ng-if="panel.fields.length<1"
bo-text="event._source|stringify|tableTruncate:panel.trimFactor:1"></td>
<td ng-show="panel.fields.length>0" ng-repeat="field in panel.fields"><span
ng-if="(!panel.localTime || panel.timeField != field) && field!='mainImage' && field!='_id'"
bo-html="(event.kibana.highlight[field]||event.kibana._source[field]) |tableHighlight | tableTruncate:panel.trimFactor:panel.fields.length"
class="table-field-value"></span>
<span ng-if="field=='_id' ">
<span ng-repeat="t in [0,2,3,4,5]">
<input type="radio" name="item_{{event.kibana._source[field]}}" value="{{t}}" onclick="postESUpdate('{{event.kibana._source._index}}','{{event.kibana._source._type}}','{{event.kibana._source[field]}}',{{t}})" ng-if="event.kibana._source.rate!=t">
<input type="radio" name="item_{{event.kibana._source[field]}}" value="{{t}}" onclick="postESUpdate('{{event.kibana._source._index}}','{{event.kibana._source._type}}','{{event.kibana._source[field]}}',{{t}})" ng-if="event.kibana._source.rate==t" checked > {{t}}
</span>
</span>
<span ng-if="field=='mainImage' "><img src="/_blobs/images/{{event.kibana._source[field]}}"/></span>
<span
ng-if="panel.localTime &&panel.timeField == field &&field!='mainImage'"
bo-html="event.sort[1]|tableLocalTime:event" class="table-field-value"></span>
</td>
</tr>
To show all the images of a record in its detail view:
Code that displays all the images
<tr ng-repeat="(key,value) in event.kibana._source track by $index"
ng-class-odd="'odd'">
<td style="word-wrap:break-word" bo-text="key"></td>
<td style="white-space:nowrap"><i class="icon-search pointer"
ng-click="build_search(key,value)"
bs-tooltip="'Add filter to match this value'"></i>
<i class="icon-ban-circle pointer" ng-click="build_search(key,value,true)"
bs-tooltip="'Add filter to NOT match this value'"></i> <i
class="pointer icon-th" ng-click="toggle_field(key)"
bs-tooltip="'Toggle table column'"></i></td>
<td style="white-space:pre-wrap;word-wrap:break-word">
<span ng-if=" key != 'images' " bo-html="value|noXml|urlLink|stringify"></span>
</td>
</tr>
For the voting script we use jQuery, which Kibana already includes.
plugins/kibana/_site/index.html
Updating the rating in the JSON document on the server
function postESUpdate(index, type, id, rate){
  // Partial update: only the rate field of the document changes.
  $.ajax({
    type: "POST",
    url: "http://"+window.location.host+"/"+index+"/"+type+"/"+id+"/_update",
    // JSON.stringify produces strictly valid JSON with a quoted "rate" key.
    data: JSON.stringify({doc: {rate: rate}})
  }).done(function(){
    // Nothing to do on success.
  }).fail(function(){alert("error")});
}
This is a call to the Elasticsearch Update API that updates the rate field of the document.
That is the end of the programming. From here on it is the web interface only!

You have already seen briefly how to create filters in the screencast at the beginning of the article.
It also shows how to select a time subrange on the histogram or with the timepicker. All your filters and settings can be saved as dashboards in Kibana and loaded by name when needed.
Outside the scope of this article: search with regular expressions, security, monitoring and administration of crate.io, and SQL queries through JDBC or a client for your programming language.
Once again, running the project requires JVM 7 or later.
The application with sample data can be downloaded from Dropbox (234 MB tar.gz); unpack it and run the *nix command:
bin/crate
or on Windows:
bin\crate.bat
Open the ready-made dashboards in a browser:
http://localhost:4200/_plugin/kibana/#/dashboard/elasticsearch/When%20first%20photo%20was%20uploaded
Good luck with crate.io/Kibana and with real dating!
PS Dropbox has decided not to serve the archive today (27.11.2014). Please tell me in the comments which public file hosting service would let me host a 234 MB file with no limit on the number of downloads.

Following the results of your vote, I wrote the article "What we need to parse it. The basics of the webdriver API".