Here are some estimated counts of the occurrences of various substrings in the OT and PT scripts, taken from BlueHarvest.net (and from IMSDb in the case of ROTS).
Long TL;DR: This is a quick-and-dirty way to estimate the number of occurrences using the standard tool, grep, so there are important caveats. The count is the number of lines in each web archive that match the search criteria, not the number of individual matches. The search is case-insensitive. This should be a good estimate of the number of occurrences, because lines are short, especially in the case of dialog. Unfortunately, the IMSDb archive breaks words in the middle to limit line length, because its HTML relies on word-wrap for formatting instead of a fixed line size. Also unfortunately, this method searches not only the dialog and directions but also the raw HTML and other archive data; however, the text of the scripts should dominate the matches, since the other data is expressed in computer language. [ETA: It looks like it is possible to get text-only scripts using my browser. I should have gotten those to begin with, so that's a goof on my part. However, even if I reran this with text-only scripts, the process still wouldn't be perfect, because it wouldn't be restricted to just dialog.] Additionally, for two-word phrases, the count isn't precisely the number of lines containing the phrase: even if the pattern handled whitespace between the words in a foolproof way (which for clarity I opted not to do), grep searches one line at a time, so it couldn't match a phrase split across a line break anyway. Finally, note that the web archives (which were fetched yesterday) do not necessarily reflect the final versions of the scripts or exactly what's on screen.
Medium TL;DR: Experienced UNIX users understand the strengths and limitations of the standard grep tool.
Short TL;DR: It's offered as-is. If someone cares, they can pick up the baton and crunch more accurate data.
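To make the line-count caveat concrete, here is a minimal sketch of the difference between counting matching lines and counting individual matches. The sample file is made up for illustration and is not one of the script archives; the -o option is available in GNU and BSD grep but is not required by POSIX, so treat its availability as an assumption.

```shell
# Hypothetical three-line sample, not taken from any of the script archives.
sample=$(mktemp)
printf '%s\n' \
  'A Jedi Knight? Jedi business.' \
  'The JEDI are gone.' \
  'No match here.' > "$sample"

# grep -ic counts LINES that contain at least one case-insensitive match.
line_count=$(grep -ic jedi "$sample")

# grep -io prints every match on its own line; wc -l then counts the matches.
# tr -d ' ' strips the padding that some wc implementations add.
match_count=$(grep -io jedi "$sample" | wc -l | tr -d ' ')

echo "lines=$line_count matches=$match_count"   # prints: lines=2 matches=3
rm -f "$sample"
```

The two numbers only diverge when a line holds more than one match, which is why short lines keep the line count close to the true count.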
Code:
$ ls
BlueHarvest.net - Attack of the Clones Script.mht
BlueHarvest.net - Return of the Jedi script.mht
BlueHarvest.net - Star Wars_ A New Hope script.mht
BlueHarvest.net - The Empire Strikes Back script.mht
BlueHarvest.net - The Phantom Menace Script.mht
Star Wars_ Revenge of the Sith Script at IMSDb..mht
$ grep -ic jedi *
BlueHarvest.net - Attack of the Clones Script.mht:167
BlueHarvest.net - Return of the Jedi script.mht:43
BlueHarvest.net - Star Wars_ A New Hope script.mht:17
BlueHarvest.net - The Empire Strikes Back script.mht:20
BlueHarvest.net - The Phantom Menace Script.mht:96
Star Wars_ Revenge of the Sith Script at IMSDb..mht:247
$ grep -ic knight *
BlueHarvest.net - Attack of the Clones Script.mht:5
BlueHarvest.net - Return of the Jedi script.mht:3
BlueHarvest.net - Star Wars_ A New Hope script.mht:8
BlueHarvest.net - The Empire Strikes Back script.mht:1
BlueHarvest.net - The Phantom Menace Script.mht:8
Star Wars_ Revenge of the Sith Script at IMSDb..mht:2
$ grep -ic youngling *
BlueHarvest.net - Attack of the Clones Script.mht:3
BlueHarvest.net - Return of the Jedi script.mht:0
BlueHarvest.net - Star Wars_ A New Hope script.mht:0
BlueHarvest.net - The Empire Strikes Back script.mht:0
BlueHarvest.net - The Phantom Menace Script.mht:0
Star Wars_ Revenge of the Sith Script at IMSDb..mht:5
$ grep -ic padawan *
BlueHarvest.net - Attack of the Clones Script.mht:11
BlueHarvest.net - Return of the Jedi script.mht:0
BlueHarvest.net - Star Wars_ A New Hope script.mht:0
BlueHarvest.net - The Empire Strikes Back script.mht:0
BlueHarvest.net - The Phantom Menace Script.mht:4
Star Wars_ Revenge of the Sith Script at IMSDb..mht:2
$ grep -ic learner *
BlueHarvest.net - Attack of the Clones Script.mht:4
BlueHarvest.net - Return of the Jedi script.mht:0
BlueHarvest.net - Star Wars_ A New Hope script.mht:1
BlueHarvest.net - The Empire Strikes Back script.mht:0
BlueHarvest.net - The Phantom Menace Script.mht:2
Star Wars_ Revenge of the Sith Script at IMSDb..mht:1
$ grep -ic master *
BlueHarvest.net - Attack of the Clones Script.mht:97
BlueHarvest.net - Return of the Jedi script.mht:43
BlueHarvest.net - Star Wars_ A New Hope script.mht:14
BlueHarvest.net - The Empire Strikes Back script.mht:19
BlueHarvest.net - The Phantom Menace Script.mht:41
Star Wars_ Revenge of the Sith Script at IMSDb..mht:79
$ grep -ic order *
BlueHarvest.net - Attack of the Clones Script.mht:41
BlueHarvest.net - Return of the Jedi script.mht:23
BlueHarvest.net - Star Wars_ A New Hope script.mht:18
BlueHarvest.net - The Empire Strikes Back script.mht:19
BlueHarvest.net - The Phantom Menace Script.mht:22
Star Wars_ Revenge of the Sith Script at IMSDb..mht:50
$ grep -ic 'padawan learner' *
BlueHarvest.net - Attack of the Clones Script.mht:2
BlueHarvest.net - Return of the Jedi script.mht:0
BlueHarvest.net - Star Wars_ A New Hope script.mht:0
BlueHarvest.net - The Empire Strikes Back script.mht:0
BlueHarvest.net - The Phantom Menace Script.mht:2
Star Wars_ Revenge of the Sith Script at IMSDb..mht:1
$ grep -ic 'jedi knight' *
BlueHarvest.net - Attack of the Clones Script.mht:3
BlueHarvest.net - Return of the Jedi script.mht:3
BlueHarvest.net - Star Wars_ A New Hope script.mht:4
BlueHarvest.net - The Empire Strikes Back script.mht:0
BlueHarvest.net - The Phantom Menace Script.mht:8
Star Wars_ Revenge of the Sith Script at IMSDb..mht:2
$ grep -ic 'jedi master' *
BlueHarvest.net - Attack of the Clones Script.mht:4
BlueHarvest.net - Return of the Jedi script.mht:1
BlueHarvest.net - Star Wars_ A New Hope script.mht:0
BlueHarvest.net - The Empire Strikes Back script.mht:3
BlueHarvest.net - The Phantom Menace Script.mht:2
Star Wars_ Revenge of the Sith Script at IMSDb..mht:5
$ grep -ic 'jedi order' *
BlueHarvest.net - Attack of the Clones Script.mht:4
BlueHarvest.net - Return of the Jedi script.mht:0
BlueHarvest.net - Star Wars_ A New Hope script.mht:0
BlueHarvest.net - The Empire Strikes Back script.mht:0
BlueHarvest.net - The Phantom Menace Script.mht:0
Star Wars_ Revenge of the Sith Script at IMSDb..mht:6
$ exit
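
For anyone picking up the baton, here is one possible way to deal with the line-break limitation for two-word phrases: join the lines before searching. This is only a sketch on a made-up sample file, not a rerun of the counts above, and it assumes a grep that supports -o (GNU and BSD grep do).

```shell
# Hypothetical sample in which one "Jedi Knight" is split across a line break.
sample=$(mktemp)
printf '%s\n' \
  'He was a Jedi' \
  'Knight, like your father.' \
  'A Jedi Knight indeed.' > "$sample"

# Searching line by line only finds the phrase that sits on a single line.
per_line=$(grep -ic 'jedi knight' "$sample")

# tr -s turns every run of whitespace (including newlines) into one space,
# so the whole file becomes a single line before grep counts each match.
joined=$(tr -s '[:space:]' ' ' < "$sample" | grep -io 'jedi knight' | wc -l | tr -d ' ')

echo "per_line=$per_line joined=$joined"   # prints: per_line=1 joined=2
rm -f "$sample"
```

This does not repair words that are broken in the middle, and it would still match raw HTML and other archive data, so text-only scripts would remain the better input.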