Prerequisites:
1. Install python3
- Follow this link to download the latest Python 3 package for OS X
- Run the package and follow the steps to install Python 3
- Once the installation is done, run the following in your Terminal to confirm that Python 3.7.4 is installed
2. Install pip3
- Securely download the get-pip.py file from this link
- From the directory where the file was downloaded, run the following command in the Terminal
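The Terminal commands for these two steps are not reproduced above. As a rough alternative check, the sketch below can be run with python3 to report the interpreter version and whether pip can be imported. The helper name environment_report and the minimum version (3, 7) are assumptions chosen for illustration, not part of the original steps.

```python
import importlib.util
import sys


def environment_report(min_version=(3, 7)):
    """Return (version string, version ok?, pip importable?) for this interpreter."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    # Compare only (major, minor) against the assumed minimum version.
    version_ok = sys.version_info[:2] >= min_version
    # find_spec returns None when the module cannot be located.
    pip_available = importlib.util.find_spec("pip") is not None
    return version, version_ok, pip_available


if __name__ == "__main__":
    version, version_ok, pip_available = environment_report()
    print("Python version:", version)
    print("Meets minimum:", version_ok)
    print("pip importable:", pip_available)
```

If the last line prints False, pip is probably not installed for the interpreter that ran the script.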
Setup:
Install the following packages:
pip3 install wikipedia wordcloud matplotlib
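To confirm the install worked before running the main script, a small check like the following may help. It is only a sketch: the helper missing_modules and the REQUIRED list are names made up here, and the list assumes that PIL (Pillow) and numpy arrive as dependencies of wordcloud, which is usually the case.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names used by generatewc.py. "PIL" and "numpy" are assumed to be
# installed as dependencies of wordcloud rather than explicitly.
REQUIRED = ["wikipedia", "wordcloud", "matplotlib", "PIL", "numpy"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing:", missing if missing else "none")
```

Any name reported as missing can be installed with pip3 before continuing.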
Python code:
Create a Python file named generatewc.py and paste the following code:
import wikipedia
from wordcloud import WordCloud, STOPWORDS
import os
from PIL import Image
import numpy as np

# Directory containing this script; cloud.png is read from here and wc.png is written here.
currdir = os.path.dirname(os.path.abspath(__file__))


def get_wiki(query):
    # Use the first search result as the page title.
    title = wikipedia.search(query)[0]
    page = wikipedia.page(title)
    return page.content


def create_wordcloud(text):
    # cloud.png is the mask image that defines the shape of the word cloud.
    mask = np.array(Image.open(os.path.join(currdir, "cloud.png")))
    stopwords = set(STOPWORDS)
    wc = WordCloud(background_color="pink",
                   mask=mask,
                   max_words=200,
                   stopwords=stopwords)
    wc.generate(text)
    wc.to_file(os.path.join(currdir, "wc.png"))


create_wordcloud(get_wiki("Sichuan cuisine"))
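The script expects a mask image called cloud.png next to generatewc.py, but the post does not say where that image comes from. If you do not have one, the sketch below builds a simple elliptical mask with numpy. It relies on wordcloud treating pure white (255) pixels as masked out and any other value as drawable; the function name ellipse_mask and the default size are assumptions for illustration.

```python
import numpy as np


def ellipse_mask(width=800, height=500):
    """Build a mask: 0 inside an ellipse (drawable), 255 outside (masked out)."""
    # Open grids of row and column indices that broadcast to (height, width).
    y, x = np.ogrid[:height, :width]
    cx, cy = (width - 1) / 2, (height - 1) / 2
    # Standard ellipse equation centred on the image.
    inside = ((x - cx) / (width / 2)) ** 2 + ((y - cy) / (height / 2)) ** 2 <= 1
    mask = np.full((height, width), 255, dtype=np.uint8)
    mask[inside] = 0
    return mask
```

The returned array can be passed directly as mask= to WordCloud, or saved for the script above with Image.fromarray(ellipse_mask()).save("cloud.png").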
Explanation: what this code does:
1. The first function, get_wiki(query), returns the content of the Wikipedia page found for a given query, for example "Sichuan cuisine"
2. The second function, create_wordcloud(text), generates a word cloud image of the most frequent words in the page content retrieved by the first function, using cloud.png as the mask for the cloud's shape
3. The image is saved in the same directory as generatewc.py, as "wc.png"
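To make "most frequent words" concrete, here is a simplified illustration of the counting idea using only the standard library. It is not how wordcloud is implemented internally (the library does its own tokenization and scaling). The names top_words and SAMPLE_STOPWORDS are made up for this sketch, and the stopword set is a tiny stand-in for the much larger STOPWORDS.

```python
import re
from collections import Counter

# A tiny illustrative stopword set; wordcloud's STOPWORDS is much larger.
SAMPLE_STOPWORDS = {"the", "a", "and", "of", "is", "in"}


def top_words(text, stopwords=SAMPLE_STOPWORDS, max_words=200):
    """Return up to `max_words` (word, count) pairs, most frequent first."""
    # Lowercase the text and split it into runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in stopwords)
    return counts.most_common(max_words)


if __name__ == "__main__":
    sample = "Sichuan cuisine is known for the bold flavour of Sichuan pepper"
    print(top_words(sample, max_words=3))
```

In the word cloud, words with higher counts are drawn larger, and max_words caps how many of them appear.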
Results: