Commit graph

108 commits

SHA1 Message Date
89d1f8301a Remove duplicated url in history 2018-02-26 17:46:49 +01:00
379b53e6ce Fix printing in gen_history 2018-02-26 17:25:04 +01:00
c94841c17b Add gen_history django-admin command 2018-02-26 17:25:04 +01:00
97107d9bec Merge branch 'master' of git.tobast.fr:tobast/mpri-webdam 2018-02-26 17:12:26 +01:00
dedc66bb9d Bug fix 2018-02-26 17:12:19 +01:00
d3d04739e7 Add DuckDuckGo lite search engine to stock data 2018-02-26 17:10:18 +01:00
    This search engine works better than the others
b88aeffd5a Helpful README 2018-02-26 17:09:05 +01:00
7c8ec7351c Merge branch 'master' of git.tobast.fr:tobast/mpri-webdam 2018-02-26 17:04:09 +01:00
2005c0f24f Add xml string gen 2018-02-26 17:03:27 +01:00
392e16b797 Merge branch 'histories_models' 2018-02-26 17:03:27 +01:00
185c1cf8a4 Fix XML generation 2018-02-26 17:00:53 +01:00
9dd1954067 Partial runner fix 2018-02-26 17:00:53 +01:00
04270e88c0 Bug fix 2018-02-26 17:00:12 +01:00
6bc64ceb7a Add requirement for aiohttp 2018-02-26 16:38:16 +01:00
15e0c2a11c Partial runner fix 2018-02-26 16:37:51 +01:00
2b07779f5c Bug fix 2018-02-26 16:37:32 +01:00
8cdc50c04e Fix stupid typo 2018-02-26 16:34:43 +01:00
22fa039f1b Remove debug print 2018-02-26 16:23:14 +01:00
e4ad8c7ce6 Towards a working XML export 2018-02-26 15:58:30 +01:00
67ad232533 Add a timeout to a single page retrieval 2018-02-26 15:42:36 +01:00
e140d4a8a7 Fix merge remanences 2018-02-26 15:37:05 +01:00
98fe69ba62 Real async crawling 2018-02-26 15:30:38 +01:00
968ff6d24c More robust crawling 2018-02-26 15:29:36 +01:00
5d4bd30e20 Exception handling 2018-02-26 15:15:03 +01:00
bdfa285e6b We do not want to use settings 2018-02-26 15:14:53 +01:00
65f777f00f Should get the objects and not the Manager 2018-02-26 15:04:26 +01:00
236e40d359 Sanity check 2018-02-26 14:57:46 +01:00
22017cea91 Typo in data u_u 2018-02-26 14:56:22 +01:00
549c861908 Bug fixed 2018-02-26 14:38:26 +01:00
517be1d822 Merge rdf branch 2018-02-26 14:11:06 +01:00
c4f63a92b2 Error in the merge, mea culpa 2018-02-26 14:01:29 +01:00
db067e56fc Typo 2018-02-26 13:59:34 +01:00
33bdae96e4 merge commit from histories_tobast into histories_models 2018-02-26 12:59:38 +01:00
526aad1364 Add interests 2018-02-26 12:33:23 +01:00
02e91bb2b7 Fix function calls 2018-02-26 11:56:02 +01:00
3e5fc2f9b3 Fix search engine URL generation 2018-02-26 11:49:24 +01:00
45ddbff91a Crawling and histories: fix a lot of stuff 2018-02-26 11:49:24 +01:00
e6d587bffd Actually save to DB a created history 2018-02-26 11:49:24 +01:00
8baf408e02 Use dict from data/nicknames_dict for nicknames 2018-02-26 11:49:24 +01:00
6463e348ac Fix populate.sh exec path 2018-02-26 11:48:51 +01:00
22064ebee3 Histories: xml import/export — untested 2018-02-26 11:48:51 +01:00
    To be tested when history generation is available
a4de51b84a Crawl: do not use global SEARCH_ENGINES 2018-02-26 11:48:51 +01:00
4f0148cb63 Crawler: use a random fingerprint 2018-02-26 11:48:51 +01:00
4a8bd32516 Fix tor_runner import 2018-02-26 11:48:51 +01:00
44cf26df8f It can be useful to save a new object 2018-02-26 11:42:45 +01:00
adb892ab7d Check if crawling a search engine 2018-02-26 11:12:36 +01:00
15db8b4697 Change option name due to downgrade of aiohttp 2018-02-26 10:23:32 +01:00
d6b26c0a46 Better use of history 2018-02-26 10:05:33 +01:00
8f5c4f3f0f Use datetimes 2018-02-26 09:49:24 +01:00
71d9e18eec Add headers support 2018-02-25 23:56:51 +01:00