Mastodon Wordcloud

Since leaving Twitter behind and moving over to Fosstodon.org (a Mastodon server), one of the fun things I have missed is creating word clouds of my most-used words. I wanted to be able to do the same with my data on Mastodon.

A few months ago I began looking at the Mastodon API and was initially overwhelmed by it. But perseverance and practice paid off, and I now have a working prototype application.

You can take a look at my example as a GitHub Gist (https://gist.github.com/vwillcox/ac2c7b8aad917bd297f1bdcaddc066f2):

import requests
import json
import time, argparse
import matplotlib.pyplot as py
from wordcloud import WordCloud,STOPWORDS
from PIL import Image
import numpy as np

parser = argparse.ArgumentParser()
parser.add_argument("-a", "--account", help="Handle to use", required=True)
parser.add_argument("-m", "--mask", help="Masking Image to use", required=True)
parser.add_argument("-o", "--output", help="Output File Name", required=True)
parser.add_argument("-k", "--key", help="API Access Token", required=True)
parser.add_argument("-t", "--transparent", help="Transparent Image (yes or no)", required=True)
args = parser.parse_args()
config = vars(args)

accountname = args.account
accessToken = args.key
transparent = args.transparent

# Look up the account by its handle; matches come back under the "accounts" key
accounts = requests.get('https://fosstodon.org/api/v2/search', params={'q': accountname, 'resolve': 'true', 'limit': 1}, headers={'Authorization': 'Bearer '+accessToken})
search_results = accounts.json()

accountID = search_results['accounts'][0]['id']


# Fetch the account's most recent statuses (the API returns a single page, 20 by default)
response = requests.get('https://fosstodon.org/api/v1/accounts/'+accountID+'/statuses', headers={'Authorization': 'Bearer '+accessToken})
statuses = json.loads(response.text)

maskingfilename = args.mask
wordcloudfile = args.output


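# Write the (HTML) content of every status to a temporary text file, one per line
tempwordfile="file.txt"
with open(tempwordfile, "w", encoding="utf-8") as f:
  for status in statuses:
    f.write(str(status["content"]))
    f.write("\n")

# Read the collected text back in; the "with" blocks close the file for us
with open(tempwordfile, "r", encoding="utf-8") as f:
  words=f.read()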

stopwords = set(STOPWORDS)
stopwords.add('https')
stopwords.add('t')
stopwords.add('co')
stopwords.add('https://t.co')
stopwords.add('span')
stopwords.add('href')
stopwords.add('class')
stopwords.add('mention')
stopwords.add('hashtag')
stopwords.add('url')
stopwords.add('rel')
stopwords.add('tag')
stopwords.add('tags')
stopwords.add('fosstodon.org')
stopwords.add('fosstodon')
stopwords.add('org')
stopwords.add('mstdn')
stopwords.add('p')
stopwords.add('u')
stopwords.add('h')
stopwords.add('card')
stopwords.add('ca')
stopwords.add('br')
stopwords.add('sbb')
stopwords.add('joinin')
stopwords.add('_blank')
stopwords.add('noopener')
stopwords.add('ac2c7')
stopwords.add('nofollow')
stopwords.add('noreferrer')

# Load the mask image that gives the word cloud its shape
mask_image = np.array(Image.open(maskingfilename))

if transparent == "yes":
   # Transparent background: RGBA mode with no background colour
   wCloud = WordCloud(
      margin=5,
      background_color=None,
      mask=mask_image,
      mode="RGBA",
      stopwords=stopwords
   ).generate(words)
else:
   # Default background, with a contour drawn around the mask shape
   wCloud = WordCloud(
      margin=5,
      mask=mask_image,
      contour_width=2,
      contour_color='steelblue',
      stopwords=stopwords
   ).generate(words)

wCloud.to_file(wordcloudfile)
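
Many of the stopwords above (span, href, class, rel and so on) are only needed because the API returns each status's content as HTML. An alternative, sketched here and not part of my script, would be to strip the markup before building the cloud. The names TextExtractor and strip_html and the sample string are made up for illustration; the only library used is Python's standard html.parser module.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment, discarding the tags."""

    def __init__(self):
        # convert_charrefs defaults to True, so entities like &amp; arrive decoded
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(fragment):
    """Return the visible text of an HTML fragment (hypothetical helper)."""
    extractor = TextExtractor()
    extractor.feed(fragment)
    extractor.close()
    # Join text nodes with spaces, then collapse runs of whitespace
    return " ".join(" ".join(extractor.parts).split())


# Hypothetical status content, shaped like the HTML a Mastodon server returns
sample = ('<p>Hello <a href="https://fosstodon.org/tags/python" '
          'class="mention hashtag" rel="tag">#<span>python</span></a> '
          'fans &amp; friends</p>')
print(strip_html(sample))
```

Joining text nodes with a space keeps words from running together at tag boundaries, at the cost of occasionally splitting a word that is partly wrapped in a tag. For a word cloud that trade-off seems acceptable, but it is a judgment call.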


Using the code

To run this code, use a command like the following in a terminal window:

python3 main.py -m masto.svg.png -o talktech040223-v5.png -a talktech -k <YOURAPIKEY> -t yes

What does this all mean?
The code takes its settings from command-line arguments, such as the “-m” and “-o” options in the command above.

-m indicates a masking file to use to create a shaped word cloud

-o indicates the name of the cloud image file to save to the disk

-a indicates the Mastodon user you want to create a word cloud for

-k is your API key (access token)

-t indicates if you want a transparent word cloud image or not (accepts yes or no)
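
As written, the script accepts any text for -t and silently treats everything other than "yes" as "no". A possible tightening, shown only as a sketch, is to let argparse validate the value with its choices parameter and fall back to "no" when the option is left out. The build_parser function name and the placeholder values are made up for this example.

```python
import argparse


def build_parser():
    """Build the same five options, with -t restricted to yes/no (sketch)."""
    parser = argparse.ArgumentParser(description="Mastodon word cloud options")
    parser.add_argument("-a", "--account", help="Handle to use", required=True)
    parser.add_argument("-m", "--mask", help="Masking image to use", required=True)
    parser.add_argument("-o", "--output", help="Output file name", required=True)
    parser.add_argument("-k", "--key", help="API access token", required=True)
    # Assumption: make -t optional, defaulting to "no", and reject other values
    parser.add_argument("-t", "--transparent", choices=["yes", "no"], default="no",
                        help="Transparent image (yes or no)")
    return parser


# Parse an explicit list of placeholder values instead of the real command line
args = build_parser().parse_args(
    ["-m", "masto.svg.png", "-o", "cloud.png", "-a", "talktech", "-k", "PLACEHOLDER"]
)
print(args.transparent)
```

With choices set, a typo such as "-t yse" makes argparse exit with a usage message rather than quietly producing a non-transparent image.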

To get an API key, go to https://fosstodon.org/settings/applications (for Fosstodon accounts); other Mastodon servers have a similar option. Once you have created an “Application”, the key you need is the one labelled “Your access token”. NEVER EVER give this out, as people will be able to use it to post to your Mastodon feed and lots more!
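
Because the token is so sensitive, one option worth considering is to keep it out of the command line, where it can end up in your shell history, and read it from an environment variable instead. This is a minimal sketch and not what my script currently does; the variable name MASTODON_TOKEN and both helper functions are assumptions chosen for illustration.

```python
import os


def get_access_token(cli_value=None, env=os.environ):
    """Prefer a token given with -k, else fall back to MASTODON_TOKEN (assumed name)."""
    token = cli_value or env.get("MASTODON_TOKEN")
    if not token:
        raise SystemExit("No access token: pass -k or set MASTODON_TOKEN")
    return token


def auth_headers(token):
    """Build the Authorization header in the same form the script sends."""
    return {"Authorization": "Bearer " + token}


# Demonstration with a fake environment, so no real token is needed
headers = auth_headers(get_access_token(env={"MASTODON_TOKEN": "example-token"}))
print(headers)
```

You would then set MASTODON_TOKEN once in your shell session and could leave -k off the command, provided the -k argument is also changed so that it is no longer required.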