108 Commits

Author SHA1 Message Date
Rémi Oudin 89d1f8301a Remove duplicated url in history 2018-02-26 17:46:49 +01:00
Théophile Bastian 379b53e6ce Fix printing in gen_history 2018-02-26 17:25:04 +01:00
Théophile Bastian c94841c17b Add gen_history django-admin command 2018-02-26 17:25:04 +01:00
Rémi Oudin 97107d9bec Merge branch 'master' of git.tobast.fr:tobast/mpri-webdam 2018-02-26 17:12:26 +01:00
Rémi Oudin dedc66bb9d Bug fix 2018-02-26 17:12:19 +01:00
Théophile Bastian d3d04739e7 Add DuckDuckGo lite search engine to stock data (This search engine works better than the others) 2018-02-26 17:10:18 +01:00
Théophile Bastian b88aeffd5a Helpful README 2018-02-26 17:09:05 +01:00
Rémi Oudin 7c8ec7351c Merge branch 'master' of git.tobast.fr:tobast/mpri-webdam 2018-02-26 17:04:09 +01:00
Théophile Bastian 2005c0f24f Add xml string gen 2018-02-26 17:03:27 +01:00
Rémi Oudin 392e16b797 Merge branch 'histories_models' 2018-02-26 17:03:27 +01:00
Théophile Bastian 185c1cf8a4 Fix XML generation 2018-02-26 17:00:53 +01:00
Rémi Oudin 9dd1954067 Partial runner fix 2018-02-26 17:00:53 +01:00
Rémi Oudin 04270e88c0 Bug fix 2018-02-26 17:00:12 +01:00
Théophile Bastian 6bc64ceb7a Add requirement for aiohttp 2018-02-26 16:38:16 +01:00
Rémi Oudin 15e0c2a11c Partial runner fix 2018-02-26 16:37:51 +01:00
Rémi Oudin 2b07779f5c Bug fix 2018-02-26 16:37:32 +01:00
Théophile Bastian 8cdc50c04e Fix stupid typo 2018-02-26 16:34:43 +01:00
Rémi Oudin 22fa039f1b Remove debug print 2018-02-26 16:23:14 +01:00
Théophile Bastian e4ad8c7ce6 Towards a working XML export 2018-02-26 15:58:30 +01:00
Théophile Bastian 67ad232533 Add a timeout to a single page retrieval 2018-02-26 15:42:36 +01:00
Théophile Bastian e140d4a8a7 Fix merge remanences 2018-02-26 15:37:05 +01:00
Théophile Bastian 98fe69ba62 Real async crawling 2018-02-26 15:30:38 +01:00
Théophile Bastian 968ff6d24c More robust crawling 2018-02-26 15:29:36 +01:00
Rémi Oudin 5d4bd30e20 Exception handling 2018-02-26 15:15:03 +01:00
Rémi Oudin bdfa285e6b We do not want to use settings 2018-02-26 15:14:53 +01:00
Rémi Oudin 65f777f00f Should get the objects and not the Manager 2018-02-26 15:04:26 +01:00
Rémi Oudin 236e40d359 Sanity check 2018-02-26 14:57:46 +01:00
Rémi Oudin 22017cea91 Typo in data u_u 2018-02-26 14:56:22 +01:00
Rémi Oudin 549c861908 Bug fixed 2018-02-26 14:38:26 +01:00
Rémi Oudin 517be1d822 Merge rdf branch 2018-02-26 14:11:06 +01:00
Rémi Oudin c4f63a92b2 Error in the merge, mea culpa 2018-02-26 14:01:29 +01:00
Rémi Oudin db067e56fc Typo 2018-02-26 13:59:34 +01:00
Rémi Oudin 33bdae96e4 merge commit from histories_tobast into histories_models 2018-02-26 12:59:38 +01:00
Rémi Oudin 526aad1364 Add interests 2018-02-26 12:33:23 +01:00
Théophile Bastian 02e91bb2b7 Fix function calls 2018-02-26 11:56:02 +01:00
Théophile Bastian 3e5fc2f9b3 Fix search engine URL generation 2018-02-26 11:49:24 +01:00
Théophile Bastian 45ddbff91a Crawling and histories: fix a lot of stuff 2018-02-26 11:49:24 +01:00
Théophile Bastian e6d587bffd Actually save to DB a created history 2018-02-26 11:49:24 +01:00
Théophile Bastian 8baf408e02 Use dict from data/nicknames_dict for nicknames 2018-02-26 11:49:24 +01:00
Théophile Bastian 6463e348ac Fix populate.sh exec path 2018-02-26 11:48:51 +01:00
Théophile Bastian 22064ebee3 Histories: xml import/export - untested (To be tested when history generation is available) 2018-02-26 11:48:51 +01:00
Théophile Bastian a4de51b84a Crawl: do not use global SEARCH_ENGINES 2018-02-26 11:48:51 +01:00
Théophile Bastian 4f0148cb63 Crawler: use a random fingerprint 2018-02-26 11:48:51 +01:00
Théophile Bastian 4a8bd32516 Fix tor_runner import 2018-02-26 11:48:51 +01:00
Rémi Oudin 44cf26df8f It can be useful to save a new object 2018-02-26 11:42:45 +01:00
Rémi Oudin adb892ab7d Check if crawling a search engine 2018-02-26 11:12:36 +01:00
Rémi Oudin 15db8b4697 Change option name due to downgrade of aiohttp 2018-02-26 10:23:32 +01:00
Rémi Oudin d6b26c0a46 Better use of history 2018-02-26 10:05:33 +01:00
Rémi Oudin 8f5c4f3f0f Use datetimes 2018-02-26 09:49:24 +01:00
Rémi Oudin 71d9e18eec Add headers support 2018-02-25 23:56:51 +01:00