|
89d1f8301a
|
Remove duplicated url in history
|
2018-02-26 17:46:49 +01:00 |
|
|
379b53e6ce
|
Fix printing in gen_history
|
2018-02-26 17:25:04 +01:00 |
|
|
c94841c17b
|
Add gen_history django-admin command
|
2018-02-26 17:25:04 +01:00 |
|
|
97107d9bec
|
Merge branch 'master' of git.tobast.fr:tobast/mpri-webdam
|
2018-02-26 17:12:26 +01:00 |
|
|
dedc66bb9d
|
Bug fix
|
2018-02-26 17:12:19 +01:00 |
|
|
d3d04739e7
|
Add DuckDuckGo lite search engine to stock data
This search engine works better than the others
|
2018-02-26 17:10:18 +01:00 |
|
|
b88aeffd5a
|
Helpful README
|
2018-02-26 17:09:05 +01:00 |
|
|
7c8ec7351c
|
Merge branch 'master' of git.tobast.fr:tobast/mpri-webdam
|
2018-02-26 17:04:09 +01:00 |
|
|
2005c0f24f
|
Add xml string gen
|
2018-02-26 17:03:27 +01:00 |
|
|
392e16b797
|
Merge branch 'histories_models'
|
2018-02-26 17:03:27 +01:00 |
|
|
185c1cf8a4
|
Fix XML generation
|
2018-02-26 17:00:53 +01:00 |
|
|
9dd1954067
|
Partial runner fix
|
2018-02-26 17:00:53 +01:00 |
|
|
04270e88c0
|
Bug fix
|
2018-02-26 17:00:12 +01:00 |
|
|
6bc64ceb7a
|
Add requirement for aiohttp
|
2018-02-26 16:38:16 +01:00 |
|
|
15e0c2a11c
|
Partial runner fix
|
2018-02-26 16:37:51 +01:00 |
|
|
2b07779f5c
|
Bug fix
|
2018-02-26 16:37:32 +01:00 |
|
|
8cdc50c04e
|
Fix stupid typo
|
2018-02-26 16:34:43 +01:00 |
|
|
22fa039f1b
|
Remove debug print
|
2018-02-26 16:23:14 +01:00 |
|
|
e4ad8c7ce6
|
Towards a working XML export
|
2018-02-26 15:58:30 +01:00 |
|
|
67ad232533
|
Add a timeout to a single page retrieval
|
2018-02-26 15:42:36 +01:00 |
|
|
e140d4a8a7
|
Fix merge remnants
|
2018-02-26 15:37:05 +01:00 |
|
|
98fe69ba62
|
Real async crawling
|
2018-02-26 15:30:38 +01:00 |
|
|
968ff6d24c
|
More robust crawling
|
2018-02-26 15:29:36 +01:00 |
|
|
5d4bd30e20
|
Exception handling
|
2018-02-26 15:15:03 +01:00 |
|
|
bdfa285e6b
|
We do not want to use settings
|
2018-02-26 15:14:53 +01:00 |
|
|
65f777f00f
|
Should get the objects and not the Manager
|
2018-02-26 15:04:26 +01:00 |
|
|
236e40d359
|
Sanity check
|
2018-02-26 14:57:46 +01:00 |
|
|
22017cea91
|
Typo in data u_u
|
2018-02-26 14:56:22 +01:00 |
|
|
549c861908
|
Bug fixed
|
2018-02-26 14:38:26 +01:00 |
|
|
517be1d822
|
Merge rdf branch
|
2018-02-26 14:11:06 +01:00 |
|
|
c4f63a92b2
|
Error in the merge, mea culpa
|
2018-02-26 14:01:29 +01:00 |
|
|
db067e56fc
|
Typo
|
2018-02-26 13:59:34 +01:00 |
|
|
33bdae96e4
|
Merge commit from histories_tobast into histories_models
|
2018-02-26 12:59:38 +01:00 |
|
|
526aad1364
|
Add interests
|
2018-02-26 12:33:23 +01:00 |
|
|
02e91bb2b7
|
Fix function calls
|
2018-02-26 11:56:02 +01:00 |
|
|
3e5fc2f9b3
|
Fix search engine URL generation
|
2018-02-26 11:49:24 +01:00 |
|
|
45ddbff91a
|
Crawling and histories: fix a lot of stuff
|
2018-02-26 11:49:24 +01:00 |
|
|
e6d587bffd
|
Actually save to DB a created history
|
2018-02-26 11:49:24 +01:00 |
|
|
8baf408e02
|
Use dict from data/nicknames_dict for nicknames
|
2018-02-26 11:49:24 +01:00 |
|
|
6463e348ac
|
Fix populate.sh exec path
|
2018-02-26 11:48:51 +01:00 |
|
|
22064ebee3
|
Histories: xml import/export — untested
To be tested when history generation is available
|
2018-02-26 11:48:51 +01:00 |
|
|
a4de51b84a
|
Crawl: do not use global SEARCH_ENGINES
|
2018-02-26 11:48:51 +01:00 |
|
|
4f0148cb63
|
Crawler: use a random fingerprint
|
2018-02-26 11:48:51 +01:00 |
|
|
4a8bd32516
|
Fix tor_runner import
|
2018-02-26 11:48:51 +01:00 |
|
|
44cf26df8f
|
It can be useful to save a new object
|
2018-02-26 11:42:45 +01:00 |
|
|
adb892ab7d
|
Check if crawling a search engine
|
2018-02-26 11:12:36 +01:00 |
|
|
15db8b4697
|
Change option name due to downgrade of aiohttp
|
2018-02-26 10:23:32 +01:00 |
|
|
d6b26c0a46
|
Better use of history
|
2018-02-26 10:05:33 +01:00 |
|
|
8f5c4f3f0f
|
Use datetimes
|
2018-02-26 09:49:24 +01:00 |
|
|
71d9e18eec
|
Add headers support
|
2018-02-25 23:56:51 +01:00 |
|