Complete word count analysis of Security Now, episode 1 through 370.

Published: 09-09-2012 | Author: Remy van Elst


Security Now is a podcast by Leo Laporte and Steve Gibson released on the Twit.tv network.

Steve pays to have the podcast transcribed, and the transcripts are available over at grc.com.

I decided to run my analyzer over the complete podcast text archive. This is from episode 001 to 371.

Get the files:

for i in {001..371}; do curl "http://www.grc.com/sn/sn-${i}.txt" >> sn.txt; echo "$i"; done
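
The loop above appends whatever the server returns, so an HTML error page for a missing episode would end up in sn.txt as well. A slightly more defensive sketch follows. The function names `sn_url` and `fetch_all` are made up for this example, and the URL pattern is simply the one used in the loop above.

```shell
# Sketch only: sn_url and fetch_all are illustrative names, not an existing tool.

# Build the transcript URL for an episode number (pass a plain integer such as 7).
sn_url() {
    printf 'http://www.grc.com/sn/sn-%03d.txt\n' "$1"
}

# Download episodes $1..$2 and append them to the file named in $3.
# curl -f turns HTTP errors (such as 404) into a failure instead of saving the
# error page; -sS hides the progress meter but still prints error messages.
fetch_all() {
    i=$1
    while [ "$i" -le "$2" ]; do
        curl -fsS "$(sn_url "$i")" >> "$3" || echo "episode $i failed" >&2
        i=$((i + 1))
    done
}

# Usage (this performs network requests):
#   fetch_all 1 371 sn.txt
```

Because the zero padding is done by `printf`, this version does not depend on the shell supporting padded brace expansion.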

Clean the files up:

cat sn.txt | LC_CTYPE=C tr -cd '[:alnum:] [:space:]' > csn.txt
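
To see what the clean-up filter does, it helps to run it on a single line. The wrapper name `clean_text` below is hypothetical; the `tr` call is the same idea as above, with `LC_ALL=C` used instead of `LC_CTYPE=C` because `LC_ALL` overrides `LC_CTYPE` when both are set.

```shell
# clean_text: hypothetical wrapper around the tr filter used above.
# Reads stdin and keeps only ASCII letters, digits and whitespace;
# punctuation and non-ASCII bytes are deleted.
clean_text() {
    LC_ALL=C tr -cd '[:alnum:][:space:]'
}

printf "It's a test, Leo: 100%% sure.\n" | clean_text
# prints: Its a test Leo 100 sure
```

Since apostrophes are deleted rather than replaced, contractions collapse into one word: "it's" is counted as "its", which likely accounts for much of the "its" count in the results.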

Analyze the text file:

cat csn.txt | LC_CTYPE=C tr '[:space:]' '\n' | grep -v "^\s*$" | sort | uniq -c | sort -bnr > count-combined.txt
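
The counting pipeline is case-sensitive, so "and" and "And" end up as separate entries. As a sketch of an alternative, the hypothetical helpers below wrap the same pipeline in a function and add a variant that folds everything to lower case first. The names `count_words` and `count_words_ci` are assumptions made for this example.

```shell
# count_words / count_words_ci: hypothetical helper names for this sketch.
# Both read cleaned text on stdin and print "count word" lines, most frequent first.

# Case-sensitive, same idea as the pipeline above.
count_words() (
    export LC_ALL=C               # byte-wise sorting, independent of the user's locale
    tr -s '[:space:]' '\n' |      # one word per line; -s squeezes runs of newlines
        grep -v '^$' |            # drop a possible leading empty line
        sort | uniq -c |          # group identical words and count them
        sort -bnr                 # highest count first
)

# Case-insensitive variant: fold to lower case so "And" and "and" are merged.
count_words_ci() (
    export LC_ALL=C
    tr '[:upper:]' '[:lower:]' | count_words
)

# Small demonstration: "the" occurs twice in lower case and once capitalised.
printf 'the cat the dog\nThe end\n' | count_words_ci | head -n 1
```

On the sample line the case-insensitive variant reports a count of 3 for "the", where the case-sensitive one reports 2. Usage on the real data would be `count_words_ci < csn.txt > count-combined.txt`.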

Result:

ed count-combined.txt 
461930
1,20np
1       65548 the
2       49919 to
3       42284 that
4       40759 STEVE
5       40065 I
6       39496 a
7       35321 of
8       31706 and
9       30845 it
10      29930 is
11      24634 you
12      22213 And
13      20365 in
14      16467 this
15      14406 was
16      13811 So
17      13761 its
18      13711 for
19      12847 have
20      11599 on
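
In the listing above, the first number `ed` prints is the size of the file in bytes, and `1,20np` prints lines 1 to 20 with line numbers. The same view can be produced without an interactive editor. This is a minimal sketch; the helper name `top_n` is made up for this example.

```shell
# top_n: hypothetical helper that prints the first N lines of a file,
# each prefixed with its line number and a tab, similar to "1,Nnp" in ed.
top_n() {
    # $1 = number of lines, $2 = file name
    awk -v n="$1" 'NR > n { exit } { print NR "\t" $0 }' "$2"
}

# Usage:
#   top_n 20 count-combined.txt
```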

Steve only

cat sn.txt | grep "STEVE:" > stonly.txt     

cat stonly.txt | LC_CTYPE=C tr -cd '[:alnum:] [:space:]' > stonlyclean.txt

cat stonlyclean.txt | LC_CTYPE=C tr '[:space:]' '\n' | grep -v "^\s*$" | sort | uniq -c | sort -bnr > sto.txt
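
The grep above matches the speaker label anywhere in a line and keeps the label itself, so the name is counted as a word on every line. A possible refinement is sketched below. It assumes every line spoken by a host starts with the label (`STEVE:` or `LEO:`) at the beginning of the line, which is an assumption about the transcript layout, and the helper name `speaker_words` is made up for this example.

```shell
# speaker_words: hypothetical helper. Assumes each spoken line in the transcript
# starts with an upper-case label such as "STEVE:" or "LEO:".
# $1 = speaker label without the colon; the transcript is read from stdin.
speaker_words() {
    grep "^$1:" |                      # only lines that begin with the label
        sed "s/^$1:[[:space:]]*//"     # strip the label so it is not counted as a word
}

printf 'STEVE: Yes, exactly.\nLEO: I asked STEVE: why?\nSTEVE: Right.\n' |
    speaker_words STEVE
# prints:
# Yes, exactly.
# Right.
```

Usage on the real data would be `speaker_words STEVE < sn.txt > stonly.txt`, and the same call with `LEO` covers the other host. With the label stripped, the speaker's own name should no longer sit near the top of the list.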

Result

ed sto.txt 
461930
1,20np
1       65548 the
2       49919 to
3       42284 that
4       40759 STEVE
5       40065 I
6       39496 a
7       35321 of
8       31706 and
9       30845 it
10      29930 is
11      24634 you
12      22213 And
13      20365 in
14      16467 this
15      14406 was
16      13811 So
17      13761 its
18      13711 for
19      12847 have
20      11599 on 

Leo only

cat sn.txt | grep "LEO:" > leoonly.txt     

cat leoonly.txt | LC_CTYPE=C tr -cd '[:alnum:] [:space:]' > leoonlyclean.txt

cat leoonlyclean.txt | LC_CTYPE=C tr '[:space:]' '\n' | grep -v "^\s*$" | sort | uniq -c | sort -bnr > leoc.txt

Result

ed leoc.txt 
367236
1,20np
1       40349 LEO
2       30161 the
3       25301 to
4       24623 I
5       23060 a
6       19027 you
7       17115 it
8       16441 that
9       15115 of
10      13676 and
11      12256 is
12      9785 in
13      8689 And
14      8282 this
15      7633 have
16      7552 on
17      7094 for
18      6492 its
19      6032 do
20      5922 know    

Tags: analyze , articles , bash , leo-laporte , podcast , security-now , steve-gibson