Macs in Chemistry

Insanely great science

 

Jupyter notebook to create Wordcloud of tweets

I've often wanted to try creating a word cloud, and when Noel O'Boyle collected all the tweets from the Sheffield Conference on Chemoinformatics, this seemed a good opportunity.

Relive the Sheffield Conf on Chemoinformatics with these #shef2019 tweets I've pulled down from Twitter, link to tweet.

The Jupyter notebook used to create the word cloud is shown below; it uses the excellent word cloud generator word_cloud. You will need to download the text from the link provided in the tweet.

In [1]:
# Python program to generate WordCloud 

# importing all necessary modules 
from wordcloud import WordCloud, STOPWORDS #https://github.com/amueller/word_cloud
import matplotlib.pyplot as plt

%matplotlib inline

In [2]:
FilePath = '/path_to_file/wordcloud/ShefTwittertext.txt'
FilePath #path to file

Out[2]:
'/path_to_file/wordcloud/ShefTwittertext.txt'

In [3]:
with open(FilePath, 'r') as content_file:
    content = content_file.read()
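
Tweets often contain emoji and other non-ASCII characters, so it can help to state the encoding explicitly when reading the file. The following is a minimal sketch, assuming the downloaded text is UTF-8 encoded (the source does not say); `read_text_file` is a hypothetical helper name, not part of the original notebook.

```python
from pathlib import Path


def read_text_file(path, encoding="utf-8"):
    """Return the whole file as a single string.

    Assumption: the file is UTF-8 encoded. Bytes that cannot be decoded
    are replaced rather than raising UnicodeDecodeError.
    """
    return Path(path).read_text(encoding=encoding, errors="replace")
```

With this helper, the cell above could be written as `content = read_text_file(FilePath)`.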

In [4]:
comment_words = ' '
stopwords = set(STOPWORDS) 

In [5]:
content = str(content)

In [6]:
tokens = content.split()
#tokens

In [7]:
# Convert each token to lowercase
tokens = [token.lower() for token in tokens]

In [8]:
# remove stopwords
for words in tokens: 
    if words not in stopwords: 
        comment_words = comment_words + words + ' '

In [9]:
counts = dict()
for word in comment_words.split():
    if word in counts:
        counts[word] += 1
    else:
        counts[word] = 1

#counts
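
The tokenising, lowercasing, stopword filtering and counting steps above can also be sketched more compactly with `collections.Counter`. This is an illustrative alternative, not the notebook's own code: `count_words` and `EXAMPLE_STOPWORDS` are hypothetical names, the small stopword set is a stand-in for `wordcloud.STOPWORDS`, and stripping punctuation is an added assumption that the original cells do not make.

```python
import string
from collections import Counter

# Stand-in stopword list for illustration only; the notebook uses
# wordcloud.STOPWORDS instead.
EXAMPLE_STOPWORDS = {"the", "a", "of", "and", "to", "in"}

# Assumption: leading/trailing punctuation is stripped so that "talk,"
# and "talk" are counted together; '#' and '@' are kept so hashtags and
# Twitter handles stay recognisable.
STRIP_CHARS = string.punctuation.replace("#", "").replace("@", "")


def count_words(text, stopwords=EXAMPLE_STOPWORDS):
    """Lowercase, split on whitespace, drop stopwords, count the rest."""
    counts = Counter()
    for token in text.lower().split():
        word = token.strip(STRIP_CHARS)
        if word and word not in stopwords:
            counts[word] += 1
    return counts
```

Because `Counter` is a subclass of `dict`, the result should be usable wherever the notebook uses `counts`, for example `count_words(content, stopwords)` with the `stopwords` set defined earlier.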

In [10]:
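# Build the word cloud from the word counts.
# Note: generate_from_frequencies() does not apply the stopwords argument;
# the stopwords were already filtered out in cell 8.
wordcloud = WordCloud(width=800, height=800,
                background_color='white',
                max_words=1000,
                relative_scaling=0.21,
                stopwords=stopwords,
                min_font_size=10).generate_from_frequencies(counts)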

In [11]:
# plot the WordCloud image                        
plt.figure(figsize = (8, 8), facecolor = None) 
plt.imshow(wordcloud) 
plt.axis("off") 
plt.tight_layout(pad = 0) 

plt.show()

In [12]:
# redraw with bilinear interpolation and save the figure as a PNG file
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.savefig('test.png', dpi=1000)

Last Updated 27 June 2019