• Welcome! The TrekBBS is the number one place to chat about Star Trek with like-minded fans.
    If you are not already a member then please register an account and join in the discussion!

Lines per Character

Landru47

Not sure if this has been done before, but here's a breakdown of the percentage of spoken lines per character:

Code:
% of lines      Kirk    Spock   McCoy   Scott   Uhura   Sulu
Season 1        30.37   15.48   6.92    2.66    2.44    2.97
Season 2        31.60   15.06   9.52    5.36    2.89    2.00
Season 3        30.45   15.93   9.89    6.24    2.27    2.25
Movie (2009)    24.30   12.15   7.52    2.99    5.69    3.57

If there's any interest I could break it down further by episode.
 
How did you go about compiling this chart?

Mainly through the command line. I downloaded scripts from http://www.chakoteya.net/startrek/ and then used grep and wc to count all the lines.

Edit: In case anyone wants to double check my work, here's what I did:

Code:
# Download the scripts (recursive, throttled to go easy on the server)
wget -e robots=off --wait=1 --limit-rate=20k -r http://www.chakoteya.net/startrek/
# Find all of Spock's lines and display the result. You'll have to manually add the results together.
# grep -o prints each match on its own line and -h drops the filename prefix, so wc -l counts matches.
grep -oh 'SPOCK:' {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,16b,17,18,19,20,21,22,23,24,25,26,27,28,29}.htm | wc -l
grep -oh 'SPOCK \[OC]:' {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,16b,17,18,19,20,21,22,23,24,25,26,27,28,29}.htm | wc -l # wc -l counts this two-word match once, so no dividing by two
# Find the total number of lines (this counts every colon in the files, so treat it as approximate)
grep -oh ':' {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,16b,17,18,19,20,21,22,23,24,25,26,27,28,29}.htm | wc -l

Just replace SPOCK with whoever and the 1,2,3...etc. to whichever episodes you want. There might be a more efficient way to do it, but I'm fairly new to this kinda stuff. I actually did this for a text analysis class.
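
One possibly more efficient way, as a minimal sketch rather than the method actually used for the chart: wrap the counts in a small bash function so the adding and dividing happen in one step. The function name `line_share` is made up for illustration, and it assumes every speech is marked with a label like SPOCK: or SPOCK [OC]:, with the total taken as every colon in the files, as in the commands above.

```shell
# Hypothetical helper: print a character's share of all lines in the given files.
# Assumes speeches are labelled NAME: or NAME [OC]: as in the commands above.
line_share() {
  local name=$1; shift
  local mine total
  # Count the character's labels, with or without the [OC] tag, one match per output line.
  mine=$(grep -ohE "${name}( \[OC])?:" "$@" | wc -l)
  # Count every colon, the same (approximate) total as the original command.
  total=$(grep -oh ':' "$@" | wc -l)
  # Let awk do the floating-point division; guard against an empty total.
  awk -v m="$mine" -v t="$total" \
    'BEGIN { if (t > 0) printf "%.2f\n", 100 * m / t; else print "0.00" }'
}
```

Something like `line_share SPOCK {1,2,3}.htm` would then print a single percentage for those episodes.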
 
I've seen individual line counts before, but not, I think, as percentages. Kirk with the lion's share is no surprise. It is odd that Spock drops a bit in Season 2, since he was the breakout character and they were supposed to be emphasizing him more. Sulu's drop after Season 1 shows the effect Chekov (and Takei's lengthy absence) had on the character.
 
Sadly, meaningless until you define your terms. What constitutes a "line"? A dialog block (CHARACTER NAME:)?
 
If you look at the code, I defined a line as any time a character starts speaking in the script. For example, any time that SPOCK: or SPOCK [OC]: appears in http://www.chakoteya.net/startrek/44.htm, that's counted as a line. It could be 1 word or 50.

I may go back and look at words per line, but like I said I'm pretty new to this stuff and would have to do some research on how to do that.
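
For what it's worth, words per line could be sketched with awk along these lines. This is only a rough guess at how it might be done, not something that was run for the chart. The function name `words_per_line` is made up, and it assumes each speech sits on one text line that starts with the label and does not wrap onto following lines. Stage directions inside a speech would be counted as words too.

```shell
# Hypothetical helper: lines, words, and words per line for one character.
# Assumes a speech starts its text line with NAME: or NAME [OC]: and stays on that line.
words_per_line() {
  local name=$1; shift
  awk -v name="$name" '
    {
      line = $0
      gsub(/<[^>]*>/, "", line)   # drop HTML tags such as <br>
      # Keep only lines that begin with the label, then take the text after it.
      if (index(line, name ":") == 1)           rest = substr(line, length(name) + 2)
      else if (index(line, name " [OC]:") == 1) rest = substr(line, length(name) + 7)
      else next
      lines++
      words += split(rest, parts, " ")   # split on whitespace runs and count the pieces
    }
    END {
      printf "%d lines, %d words, %.2f words per line\n", lines, words, (lines ? words / lines : 0)
    }
  ' "$@"
}
```

Usage would look like `words_per_line SPOCK 44.htm`.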
 
This is a fascinating project, but as Maurice points out, your methodology is flawed. As I understand it, a page-long monologue by Kirk receives the same weight as a two-word line from Uhura.
 
Still, it's some indication.
Would word count percentage be a better method?

Wasn't Shatner using line count as his measure?
 

Very cool! Thanks for sharing! If you can manage it, a second analysis using word count would also be interesting.

Anyone else surprised that Spock's percentage stays more or less unchanged? It always feels to me like he gets more emphasis in the second season, but the data don't seem to bear that out.
 
This is a fascinating project, but as Maurice points out, your methodology is flawed. As I understand it, a page-long monologue by Kirk receives the same weight as a two-word line from Uhura.

Counting words would be a better indication of the relative "weight" of character participation per episode. It's probably best to try a few different measures and compare the results. If they more or less line up, you've got your answer. :)
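
As a rough sketch of that second measure, a character's share of all spoken words could be computed like this and set next to the line percentages. The function name `word_share` is invented for illustration. It assumes a speech is any text line that starts with an upper-case label and a colon, optionally tagged like [OC], and that speeches do not wrap across lines. Anything else that happens to fit that pattern would be miscounted as dialogue.

```shell
# Hypothetical helper: a character's share of all spoken words, as a percentage.
# Assumes each speech is one text line starting with an upper-case label and a colon.
word_share() {
  local name=$1; shift
  awk -v name="$name" '
    {
      line = $0
      gsub(/<[^>]*>/, "", line)                 # drop HTML tags such as <br>
      # Skip lines that do not start with a speaker label.
      if (match(line, /^[A-Z][A-Z0-9 .-]*( \[[A-Za-z ]+\])?:/) == 0) next
      label = substr(line, 1, RLENGTH - 1)      # the label without its colon
      sub(/ \[.*$/, "", label)                  # strip a tag such as [OC]
      n = split(substr(line, RLENGTH + 1), parts, " ")   # words after the label
      total += n
      if (label == name) mine += n
    }
    END { printf "%.2f\n", (total ? 100 * mine / total : 0) }
  ' "$@"
}
```

If the word shares and the line shares more or less agree for a season, the line counts are probably a fair proxy.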
 
This is a fascinating project, but as Maurice points out, your methodology is flawed. As I understand it, a page-long monologue by Kirk receives the same weight as a two-word line from Uhura.

True, but does this method also take into account Shatner's pauses and fast-spoken style of acting?

Mr.....Scott.....how....longuntilwehave....warp....drive?

How many lines would that be? :p
 
Shatner's cadence has been greatly exaggerated over the years. I think Maurice's approach would be the most useful -- assuming the coding could be done without any difficulty. I certainly don't have the skill set to do it.
 
I did one here for all series and movies that is based on word count. Further down I highlighted just the original gang.
I'm not sure of the purpose of this exercise but maybe you'd just use TOS episodes.
I also don't think it's fair to compare TOS or ENT against TNG, VOY or DS9 unless you are doing percentages.
I mean, I'm sure Picard spoke the most, considering he had an extra 4 seasons on Kirk and 3 on Archer. But, say, Janeway vs. Picard is a reasonable comparison.
 
Shatner's cadence has been greatly exaggerated over the years. I think Maurice's approach would be the most useful -- assuming the coding could be done without any difficulty. I certainly don't have the skill set to do it.

Ugh. I HATE how everyone now assumes that's how Shatner talked in the series. When you go back and watch, he only rarely resembles the now-famous parody.
 