Commit graph

99 commits

SHA1 Message Date
d3d04739e7 Add DuckDuckGo lite search engine to stock data 2018-02-26 17:10:18 +01:00
    This search engine works better than the others
b88aeffd5a Helpful README 2018-02-26 17:09:05 +01:00
2005c0f24f Add xml string gen 2018-02-26 17:03:27 +01:00
185c1cf8a4 Fix XML generation 2018-02-26 17:00:53 +01:00
9dd1954067 Partial runner fix 2018-02-26 17:00:53 +01:00
04270e88c0 Bug fix 2018-02-26 17:00:12 +01:00
6bc64ceb7a Add requirement for aiohttp 2018-02-26 16:38:16 +01:00
8cdc50c04e Fix stupid typo 2018-02-26 16:34:43 +01:00
22fa039f1b Remove debug print 2018-02-26 16:23:14 +01:00
e4ad8c7ce6 Towards a working XML export 2018-02-26 15:58:30 +01:00
67ad232533 Add a timeout to a single page retrieval 2018-02-26 15:42:36 +01:00
e140d4a8a7 Fix merge remnants 2018-02-26 15:37:05 +01:00
98fe69ba62 Real async crawling 2018-02-26 15:30:38 +01:00
968ff6d24c More robust crawling 2018-02-26 15:29:36 +01:00
5d4bd30e20 Exception handling 2018-02-26 15:15:03 +01:00
bdfa285e6b We do not want to use settings 2018-02-26 15:14:53 +01:00
65f777f00f Should get the objects and not the Manager 2018-02-26 15:04:26 +01:00
236e40d359 Sanity check 2018-02-26 14:57:46 +01:00
22017cea91 Typo in data u_u 2018-02-26 14:56:22 +01:00
549c861908 Bug fixed 2018-02-26 14:38:26 +01:00
517be1d822 Merge rdf branch 2018-02-26 14:11:06 +01:00
c4f63a92b2 Error in the merge, mea culpa 2018-02-26 14:01:29 +01:00
db067e56fc Typo 2018-02-26 13:59:34 +01:00
33bdae96e4 merge commit from histories_tobast into histories_models 2018-02-26 12:59:38 +01:00
526aad1364 Add interests 2018-02-26 12:33:23 +01:00
02e91bb2b7 Fix function calls 2018-02-26 11:56:02 +01:00
3e5fc2f9b3 Fix search engine URL generation 2018-02-26 11:49:24 +01:00
45ddbff91a Crawling and histories: fix a lot of stuff 2018-02-26 11:49:24 +01:00
e6d587bffd Actually save to DB a created history 2018-02-26 11:49:24 +01:00
8baf408e02 Use dict from data/nicknames_dict for nicknames 2018-02-26 11:49:24 +01:00
6463e348ac Fix populate.sh exec path 2018-02-26 11:48:51 +01:00
22064ebee3 Histories: xml import/export — untested 2018-02-26 11:48:51 +01:00
    To be tested when history generation is available
a4de51b84a Crawl: do not use global SEARCH_ENGINES 2018-02-26 11:48:51 +01:00
4f0148cb63 Crawler: use a random fingerprint 2018-02-26 11:48:51 +01:00
4a8bd32516 Fix tor_runner import 2018-02-26 11:48:51 +01:00
44cf26df8f It can be useful to save a new object 2018-02-26 11:42:45 +01:00
adb892ab7d Check if crawling a search engine 2018-02-26 11:12:36 +01:00
15db8b4697 Change option name due to downgrade of aiohttp 2018-02-26 10:23:32 +01:00
d6b26c0a46 Better use of history 2018-02-26 10:05:33 +01:00
8f5c4f3f0f Use datetimes 2018-02-26 09:49:24 +01:00
71d9e18eec Add headers support 2018-02-25 23:56:51 +01:00
8ad46c0481 Bug fix, syntax error 2018-02-25 21:59:29 +01:00
f66c978466 Tor runner has a run function to replay the history 2018-02-25 21:53:28 +01:00
0a676a2f65 PEP8 2018-02-25 21:34:20 +01:00
e074d96f02 tor_runner can make requests 2018-02-25 21:27:15 +01:00
93b235cb6c Fix interests import 2018-02-25 21:20:52 +01:00
ae5699c089 Basic tor runner 2018-02-25 19:42:58 +01:00
f7313ff659 Add populate.sh script 2018-02-25 16:16:04 +01:00
0661fe0f01 Fix path 2018-02-25 16:10:38 +01:00
4b19febdf6 Add interests 2018-02-25 16:10:22 +01:00