Trump vocabulary
NOTE: this was written in a few minutes, with no attempt at clean or robust code.
This code goes through the tweets of Donald Trump and produces a ranked list of words used.
The result (not updated often, though) can be found here.
Install
Clone this repository with submodules: git clone --recurse-submodules
Alternatively, if you have already cloned the repo, you can run
git submodule update --init --depth 1
Get a shell
You can explore the data in a shell by using count_words.py as an init script for your favorite shell, e.g.
ipython -i count_words.py
The following will be available to you as variables:
tweets: the list of all tweets ever
occur: Python dictionary of occurrences of words in Trump's tweets
ranked: ranked list of occurrences of words in Trump's tweets
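As a rough illustration of what an occurrence dictionary and a ranked list can look like, here is a minimal sketch built from a few made-up strings. The sample data, the tokenizing regex, and the helper name count_occurrences are assumptions for illustration only; the names occur and ranked mirror the variables listed above, but the actual structures produced by count_words.py may differ.

```python
import re
from collections import Counter

# Hypothetical stand-in for the real tweet data (made-up strings).
sample_tweets = [
    "Great day, great people!",
    "A great deal for the people.",
]


def count_occurrences(texts):
    """Count lowercase word occurrences across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Assumed tokenization: runs of letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


# Plain dictionary mapping each word to its number of occurrences.
occur = dict(count_occurrences(sample_tweets))

# (word, count) pairs sorted by descending count, ties broken alphabetically.
ranked = sorted(occur.items(), key=lambda item: (-item[1], item[0]))
```

With this sample, occur["great"] is 3 and ranked[0] is ("great", 3). In the interactive shell you could inspect the real variables the same way, for example by slicing the first few entries of ranked.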
Generating the list
Simply run
python ./generate_list.py [OUTPUT_FILE]
If you omit OUTPUT_FILE, the list will be written to trumprank.txt.
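The optional-argument behaviour described above can be sketched as follows. This is not the code of generate_list.py: the helper names resolve_output_path and write_ranked, and the "rank word count" line format, are hypothetical. Only the default file name trumprank.txt comes from this README.

```python
DEFAULT_OUTPUT = "trumprank.txt"  # default name stated in this README


def resolve_output_path(argv):
    """Return the first positional argument, or the default when omitted."""
    return argv[1] if len(argv) > 1 else DEFAULT_OUTPUT


def write_ranked(ranked, path):
    """Write one 'rank word count' line per entry (assumed output format)."""
    with open(path, "w", encoding="utf-8") as handle:
        for rank, (word, count) in enumerate(ranked, start=1):
            handle.write(f"{rank} {word} {count}\n")
```

A script using these helpers would call resolve_output_path(sys.argv) and pass the result to write_ranked along with a list of (word, count) pairs.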