Raymii.org
Complete word count analysis of Security Now, episode 1 through 370.
Published: 09-09-2012 | Author: Remy van Elst
Security Now is a podcast by Leo Laporte and Steve Gibson released on the Twit.tv network.
Steve pays to have the podcast transcribed, and the transcripts are available over at grc.com.
I decided to run my analyzer over the complete podcast transcript archive, from episode 001 to 371.
Get the files:
for i in {001..371}; do curl http://www.grc.com/sn/sn-${i}.txt >> sn.txt; echo $i; done
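One caveat with the loop above: if an episode is missing, curl will happily append the server's error page to sn.txt. Below is a minimal sketch of a more defensive download. The function names `sn_url` and `fetch_transcripts` are made up for this example; the only thing taken from the command above is the URL pattern. It avoids the zero-padded brace expansion, which needs a fairly recent bash, by padding the number with printf.

```shell
# Hypothetical helper: build the transcript URL for an episode number (1, 2, ... 371).
sn_url() {
    printf 'http://www.grc.com/sn/sn-%03d.txt\n' "$1"
}

# Hypothetical helper: append episodes $1..$2 to the file named in $3.
fetch_transcripts() {
    i=$1
    while [ "$i" -le "$2" ]; do
        # -f: fail on HTTP errors, so an error page is not appended to the output
        # -sS: hide the progress meter but still print error messages
        curl -fsS "$(sn_url "$i")" >> "$3" || echo "episode $i failed" >&2
        i=$((i + 1))
    done
}

# Usage (not run here, because it hits the network):
#   fetch_transcripts 1 371 sn.txt
```

Failed episodes are reported on stderr, so you can see afterwards which transcripts did not make it into the file.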
Clean the files up:
cat sn.txt | LC_CTYPE=C tr -cd '[:alnum:] [:space:]' > csn.txt
Analyze the text file:
cat csn.txt | LC_CTYPE=C tr '[:space:]' '\n' | grep -v "^\s*$" | sort | uniq -c | sort -bnr > count-combined.txt
Result:
ed count-combined.txt
461930
1,20np
1 65548 the
2 49919 to
3 42284 that
4 40759 STEVE
5 40065 I
6 39496 a
7 35321 of
8 31706 and
9 30845 it
10 29930 is
11 24634 you
12 22213 And
13 20365 in
14 16467 this
15 14406 was
16 13811 So
17 13761 its
18 13711 for
19 12847 have
20 11599 on
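The list above counts "and" and "And" as two different words, because the pipeline is case sensitive. The cleanup step also strips apostrophes, so "it's" ends up counted as "its". If you want case-insensitive counts, one option is to lowercase everything before sorting. This is a sketch of that variant, not what produced the numbers above; the function name `count_words` is made up for the example.

```shell
# Hypothetical helper: case-insensitive word counts, reading text from stdin.
count_words() {
    LC_ALL=C tr -cd '[:alnum:][:space:]' |      # drop punctuation
        LC_ALL=C tr '[:upper:]' '[:lower:]' |   # fold case: "And" -> "and"
        LC_ALL=C tr -s '[:space:]' '\n' |       # one word per line
        grep -v '^$' |                          # drop a possible empty first line
        LC_ALL=C sort | uniq -c |
        LC_ALL=C sort -k1,1nr -k2               # by count (descending), then by word
}

# Small demo on inline text instead of the full transcript file
printf 'And so it is.\nAnd it was so, and so on.\n' | count_words
```

On the real data this would be `count_words < sn.txt > count-combined-lower.txt`, where the output file name is again just an example.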
Steve only
cat sn.txt | grep "STEVE:" > stonly.txt
cat stonly.txt | LC_CTYPE=C tr -cd '[:alnum:] [:space:]' > stonlyclean.txt
cat stonlyclean.txt | LC_CTYPE=C tr '[:space:]' '\n' | grep -v "^\s*$" | sort | uniq -c | sort -bnr > sto.txt
Result
ed sto.txt
461930
1,20np
1 65548 the
2 49919 to
3 42284 that
4 40759 STEVE
5 40065 I
6 39496 a
7 35321 of
8 31706 and
9 30845 it
10 29930 is
11 24634 you
12 22213 And
13 20365 in
14 16467 this
15 14406 was
16 13811 So
17 13761 its
18 13711 for
19 12847 have
20 11599 on
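The STEVE entry in this list is the speaker label itself: grep keeps the whole line, including the "STEVE:" prefix, and the cleanup step then strips the colon. If you only want the words that were actually spoken, you could cut the label off while filtering. A minimal sketch, assuming every turn is a single line that starts with the label; the function name `speaker_lines` is made up for the example.

```shell
# Hypothetical helper: print what one speaker said, without the "NAME:" label.
speaker_lines() {
    # $1 = speaker label as it appears in the transcript, e.g. STEVE or LEO
    # -n with the p flag prints only the lines where the label was found and removed
    sed -n "s/^$1:[[:space:]]*//p"
}

# Small demo on inline text instead of the full transcript file
printf 'STEVE: Yes, it is.\nLEO: I know.\nSTEVE: And so on.\n' | speaker_lines STEVE
```

On the real data this would be `speaker_lines STEVE < sn.txt > stonly.txt`, after which the rest of the pipeline stays the same.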
Leo only
cat sn.txt | grep "LEO:" > leoonly.txt
cat leoonly.txt | LC_CTYPE=C tr -cd '[:alnum:] [:space:]' > leoonlyclean.txt
cat leoonlyclean.txt | LC_CTYPE=C tr '[:space:]' '\n' | grep -v "^\s*$" | sort | uniq -c | sort -bnr > leoc.txt
Result
ed leoc.txt
367236
1,20np
1 40349 LEO
2 30161 the
3 25301 to
4 24623 I
5 23060 a
6 19027 you
7 17115 it
8 16441 that
9 15115 of
10 13676 and
11 12256 is
12 9785 in
13 8689 And
14 8282 this
15 7633 have
16 7552 on
17 7094 for
18 6492 its
19 6032 do
20 5922 know
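A note on the ed sessions above: the first number ed prints is the size of the file in bytes, not a word count, and `1,20np` prints lines 1 to 20 with line numbers. If you prefer something non-interactive, this sketch does roughly the same. The function name `top_n` is made up for the example.

```shell
# Hypothetical helper: print the first N lines of a file, numbered like ed's "1,Nnp".
top_n() {
    # $1 = number of lines, $2 = file
    awk -v n="$1" 'NR > n { exit } { printf "%d\t%s\n", NR, $0 }' "$2"
}

# Usage on the real output would be:
#   wc -c < leoc.txt     # byte count, the number ed prints when it opens the file
#   top_n 20 leoc.txt
# Small self-contained demo:
demo=$(mktemp)
printf '  3 the\n  2 to\n  1 that\n' > "$demo"
top_n 2 "$demo"
rm -f "$demo"
```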
Tags: analyze, articles, bash, leo-laporte, podcast, security-now, steve-gibson