Data Mining Project: Word Frequency

The following tables show the most frequent words in “Roughing it in the Bush” and “An Excursion to Canada”. Word frequency is a useful tool because it directs attention to individual words and the contexts in which they are used.
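For readers who want to reproduce a table of this kind outside Voyant, the following is a minimal sketch in Python. It is not Voyant's own implementation. The function name `word_frequencies`, the tiny stop-word list, the tokenizing rule, and the choice to compute the z-score over the counts of all remaining words are illustrative assumptions. The relative frequency is expressed per 10,000 words, which appears to match the scale of the fourth column in the tables.

```python
import re
from collections import Counter
from statistics import mean, pstdev

# Tiny illustrative stop-word list (an assumption; Voyant uses its own, much longer list).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "it", "was", "i"}


def word_frequencies(text, stopwords=STOPWORDS, top_n=50):
    """Return (word, count, z_score, per_10k) tuples for the top_n most frequent words."""
    # Lowercase and keep runs of letters only; this crude rule splits "don't" into "don" and "t".
    tokens = re.findall(r"[a-z]+", text.lower())
    total = len(tokens)

    # Count every token that is not a stop word.
    counts = Counter(t for t in tokens if t not in stopwords)
    if not counts:
        return []

    # Z-score of each word's count relative to the counts of all remaining words (assumed definition).
    mu = mean(counts.values())
    sigma = pstdev(counts.values())

    rows = []
    for word, count in counts.most_common(top_n):
        z = (count - mu) / sigma if sigma else 0.0
        per_10k = 10000 * count / total  # occurrences per 10,000 words of running text
        rows.append((word, count, round(z, 2), round(per_10k, 1)))
    return rows


if __name__ == "__main__":
    sample = "the old man said the old house was old"
    for row in word_frequencies(sample):
        print(*row)
```

To use it on a full text, read the file into a string and pass it to `word_frequencies`. The results will differ somewhat from Voyant's, since the stop-word list and tokenizing rules are not the same.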

Roughing it in the Bush: Word Frequency

Words in the Entire Corpus. This tool shows overall word frequencies for the entire corpus as well as information about how word frequencies are spread out over documents within the corpus.

Voyant Tools, Stéfan Sinclair & Geoffrey Rockwell (©2011) v. 1.0 beta (4302)

Word Count Z-Score Mean Std. Dev.
old 402 2.44 22.3 0.000
said 334 2.02 18.5 0.000
little 304 1.83 16.9 0.000
man 295 1.77 16.4 0.000
like 267 1.60 14.8 0.000
country 264 1.58 14.7 0.000
great 260 1.55 14.4 0.000
time 253 1.51 14.1 0.000
house 243 1.44 13.5 0.000
day 238 1.41 13.2 0.000
long 235 1.39 13.1 0.000
good 219 1.29 12.2 0.000
moodie 210 1.24 11.7 0.000
land 207 1.22 11.5 0.000
canada 195 1.14 10.8 0.000
home 187 1.09 10.4 0.000
poor 187 1.09 10.4 0.000
night 185 1.08 10.3 0.000
heart 177 1.03 9.8 0.000
mr 175 1.02 9.7 0.000
children 173 1.00 9.6 0.000
large 168 0.97 9.3 0.000
make 168 0.97 9.3 0.000
came 162 0.93 9.0 0.000
people 161 0.93 8.9 0.000
thought 160 0.92 8.9 0.000
mrs 158 0.91 8.8 0.000
years 158 0.91 8.8 0.000
husband 155 0.89 8.6 0.000
young 155 0.89 8.6 0.000
left 153 0.88 8.5 0.000
way 153 0.88 8.5 0.000
just 144 0.82 8.0 0.000
woods 142 0.81 7.9 0.000
lake 140 0.80 7.8 0.000
life 139 0.79 7.7 0.000
eyes 131 0.74 7.3 0.000
place 128 0.72 7.1 0.000
woman 128 0.72 7.1 0.000
away 127 0.71 7.1 0.000
come 127 0.71 7.1 0.000
new 124 0.70 6.9 0.000
oh 116 0.65 6.4 0.000
shall 116 0.65 6.4 0.000
cold 114 0.63 6.3 0.000
tom 114 0.63 6.3 0.000
gave 113 0.63 6.3 0.000
small 111 0.61 6.2 0.000
child 110 0.61 6.1 0.000
fine 110 0.61 6.1 0.000

An Excursion to Canada: Word Frequency


Word Count Z-Score Mean Std. Dev.
st 82 1.75 29.0 0.000
river 79 1.68 28.0 0.000
miles 75 1.59 26.6 0.000
quebec 71 1.50 25.2 0.000
french 69 1.45 24.4 0.000
canada 67 1.41 23.7 0.000
like 60 1.24 21.3 0.000
country 49 0.99 17.4 0.000
saw 48 0.97 17.0 0.000
far 45 0.90 15.9 0.000
english 42 0.83 14.9 0.000
said 41 0.81 14.5 0.000
called 40 0.78 14.2 0.000
lawrence 40 0.78 14.2 0.000
little 40 0.78 14.2 0.000
new 40 0.78 14.2 0.000
great 38 0.74 13.5 0.000
feet 36 0.69 12.8 0.000
heard 35 0.67 12.4 0.000
side 35 0.67 12.4 0.000
long 34 0.65 12.0 0.000
man 33 0.62 11.7 0.000
city 32 0.60 11.3 0.000
old 32 0.60 11.3 0.000
good 31 0.58 11.0 0.000
way 31 0.58 11.0 0.000
men 30 0.55 10.6 0.000
church 28 0.51 9.9 0.000
house 28 0.51 9.9 0.000
inhabitants 28 0.51 9.9 0.000
day 27 0.48 9.6 0.000
montreal 27 0.48 9.6 0.000
england 26 0.46 9.2 0.000
got 26 0.46 9.2 0.000
half 26 0.46 9.2 0.000
town 26 0.46 9.2 0.000
water 26 0.46 9.2 0.000
appeared 25 0.44 8.9 0.000
looked 25 0.44 8.9 0.000
small 25 0.44 8.9 0.000
thought 25 0.44 8.9 0.000
time 25 0.44 8.9 0.000
having 24 0.42 8.5 0.000
make 24 0.42 8.5 0.000
went 24 0.42 8.5 0.000
fall 23 0.39 8.1 0.000
falls 23 0.39 8.1 0.000
place 23 0.39 8.1 0.000
seen 23 0.39 8.1 0.000
