
Generating Personalized Wordlists by Analyzing Target's Tweets

9 Aug 2019

Live Demo

Utku Sen

Abstract

Adversaries need a wordlist or a combination-generation tool to conduct password guessing attacks. To narrow the combination pool, researchers developed a method called a “mask attack”, in which the attacker assumes the password’s structure. Although this narrows the combination pool significantly, the pool is still too large for online attacks or for offline attacks with limited hardware resources.
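
To make the size of a masked combination pool concrete, the following is a minimal sketch that counts the candidates a mask expands to. It assumes hashcat-style placeholders (?l, ?u, ?d, ?s), which the abstract does not name, and supports only that subset. The function name mask_keyspace is hypothetical.

```python
# Character-set sizes for hashcat-style mask placeholders.
# Assumption: the abstract does not say which mask notation is meant.
CHARSET_SIZES = {
    "l": 26,  # lowercase letters a-z
    "u": 26,  # uppercase letters A-Z
    "d": 10,  # digits 0-9
    "s": 33,  # printable ASCII symbols, including space
}


def mask_keyspace(mask):
    """Return the number of candidates a mask such as '?u?l?l?d' expands to."""
    total = 1
    i = 0
    while i < len(mask):
        if mask[i] == "?":
            # A placeholder is '?' followed by a known charset letter.
            if i + 1 >= len(mask) or mask[i + 1] not in CHARSET_SIZES:
                raise ValueError("unsupported placeholder at position %d" % i)
            total *= CHARSET_SIZES[mask[i + 1]]
            i += 2
        else:
            # A literal character is fixed, so it does not grow the pool.
            i += 1
    return total


if __name__ == "__main__":
    # One uppercase letter, five lowercase letters, two digits.
    mask = "?u?l?l?l?l?l?d?d"
    print(mask, mask_keyspace(mask))
```

For the eight-character mask above, the pool is 26^6 × 10^2 = 30,891,577,600 candidates, which illustrates why a known structure alone does not make online guessing practical.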


 


 In the real world, a password’s structure is unknown, just like the password itself. Even if we specify a password structure with masks, we are still brute-forcing the characters within the mask. When we analyzed the Ashley Madison and Myspace wordlists, we saw that they mostly consist of sequential alphabetic characters, which means there is a high probability that these sequences are meaningful words. The first step is to determine whether a letter sequence is a meaningful word in the English language. We treat a letter sequence as an English word if it is listed in an English lexicon; we used WordNet (a lexical database for English created by Princeton University) as the lexicon. Our research shows that 30% of the entries in the Ashley Madison wordlist and 36% of the entries in the Myspace wordlist contain meaningful English words.
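
The lexicon check described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code behind the reported percentages: the tiny LEXICON set is a hypothetical stand-in for WordNet, the sample passwords are invented, and only whole alphabetic runs are looked up (concatenated words are not split).

```python
import re

# Hypothetical stand-in lexicon. A real run would query WordNet instead.
LEXICON = {"dragon", "monkey", "soccer", "love", "summer"}


def alpha_sequences(password):
    """Return the lowercase runs of consecutive letters found in a password."""
    return [run.lower() for run in re.findall(r"[A-Za-z]+", password)]


def contains_english_word(password, lexicon=LEXICON, min_len=3):
    """True if any whole alphabetic run of the password is in the lexicon."""
    return any(
        len(run) >= min_len and run in lexicon
        for run in alpha_sequences(password)
    )


def share_with_words(passwords, lexicon=LEXICON):
    """Fraction of passwords containing at least one lexicon word."""
    if not passwords:
        return 0.0
    hits = sum(contains_english_word(p, lexicon) for p in passwords)
    return hits / len(passwords)


if __name__ == "__main__":
    # Invented sample data, not taken from any leaked wordlist.
    sample = ["dragon123", "qwerty", "Summer2019", "x9k2p", "ilovesoccer"]
    print(share_with_words(sample))
```

Because only whole runs are checked, "ilovesoccer" does not count as a match here; splitting runs into sub-words would raise the measured share and is a separate design decision.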


 


 If we use all the words in the Oxford English Dictionary, the combination pool will contain 171,476 words. That is still a large number for online attacks. We can reduce it if we can identify which kinds of words people usually choose. According to experiments conducted by Carnegie Mellon and Carleton universities, most people choose words for their passwords based on personal topics such as hobbies, work, religion, sports, and video games. So if we can identify candidate words from a person’s areas of interest, we can reduce the combination pool significantly. On Twitter, people tend to post mostly about their areas of interest. Twitter is therefore a good source for identifying a user’s personal topics and generating related words, which reduces the combination pool for password guessing attacks.
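
The idea of deriving candidate words from a user’s posts can be sketched with a simple frequency count. This is a rough illustration, not the tool’s implementation: the stopword filter is a crude stand-in for part-of-speech tagging, the tweets are invented, and the names tokenize and top_words are hypothetical.

```python
import re
from collections import Counter

# Small hypothetical stopword list; a crude substitute for keeping only nouns.
STOPWORDS = {
    "the", "a", "an", "is", "to", "of", "and", "in", "for", "on",
    "my", "i", "at", "this", "with", "was", "it", "its", "then",
}


def tokenize(tweet):
    """Lowercase alphabetic tokens, ignoring URLs and @mentions."""
    cleaned = re.sub(r"https?://\S+|@\w+", " ", tweet)
    return [word.lower() for word in re.findall(r"[A-Za-z]+", cleaned)]


def top_words(tweets, limit=5, min_len=3):
    """Return up to `limit` of the most frequent non-stopword tokens."""
    counts = Counter()
    for tweet in tweets:
        counts.update(
            word for word in tokenize(tweet)
            if len(word) >= min_len and word not in STOPWORDS
        )
    return [word for word, _ in counts.most_common(limit)]


if __name__ == "__main__":
    # Invented tweets used only to exercise the functions.
    tweets = [
        "Great match today, football at its best @friend",
        "New guitar strings arrived https://example.com/x",
        "Football practice, then guitar practice",
        "Watching football with the team",
    ]
    print(top_words(tweets, limit=3))
```

With the sample tweets, "football" is the most frequent token, followed by "guitar" and "practice", which is the kind of short, interest-driven list that replaces a full dictionary.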


 


 Our tool, Rhodiola, was developed to narrow the combination pool by creating a personalized wordlist for a target person. It finds a given user’s areas of interest by analyzing their tweets and builds a personalized wordlist from them. The wordlist consists of the most used nouns and proper nouns, paired nouns and proper nouns, and cities and years related to the detected proper nouns. Example usage:


 


 python rhodiola.py --username elonmusk


 


 Example output:


 


 ...
 tesla
 car
 boring
 spacex
 falcon
 flamethrower
 coloradosprings
 tesla1856
 ...
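
Entries such as tesla1856 suggest how base words can be combined into further candidates. The following is a minimal sketch of that combination step under our own assumptions, not Rhodiola’s actual logic: the pairing rule (both orders of the top few words) is a guess, the related_years mapping is a hand-written placeholder for whatever lookup the tool performs, and build_candidates is a hypothetical name.

```python
from itertools import permutations


def build_candidates(words, related_years=None, pair_limit=3):
    """Combine base words into single, paired, and word+year candidates."""
    related_years = related_years or {}
    candidates = list(words)

    # Assumed pairing rule: join the first `pair_limit` words in both orders.
    for first, second in permutations(words[:pair_limit], 2):
        candidates.append(first + second)

    # Append each related year to the word it belongs to.
    for word in words:
        for year in related_years.get(word, []):
            candidates.append("%s%s" % (word, year))

    # Remove duplicates while keeping the original order.
    return list(dict.fromkeys(candidates))


if __name__ == "__main__":
    base_words = ["tesla", "car", "falcon"]
    # Placeholder lookup; how related years are found is not specified here.
    years = {"tesla": [1856]}
    for candidate in build_candidates(base_words, years, pair_limit=2):
        print(candidate)
```

Limiting the pairing to the top few words keeps the list short, since pairing grows quadratically with the number of words paired.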
