We talked about how online corpora let you see how words like "freak" or "nonplussed" are being used today.
Interesting... what do you think?
---------------
CORPORA: 100-385 million words each: free online access: "ONLINE CORPORA
The following are some of the freely-available linguistic corpora that have been created by Mark Davies, Professor of Corpus Linguistics at Brigham Young University."
This is my favorite:
http://www.americancorpus.org/
"The corpus is composed of more than 385 million words in more than 150,000 texts, including 20 million words each year from 1990-2008. For each year (and therefore overall, as well), the corpus is evenly divided between the five genres of spoken, fiction, popular magazines, newspapers, and academic journals. The texts come from a variety of sources:
- Spoken: (79 million words) Transcripts of unscripted conversation from more than 150 different TV and radio programs (examples: All Things Considered (NPR), Newshour (PBS), Good Morning America (ABC), Today Show (NBC), 60 Minutes (CBS), Hannity and Colmes (Fox), Jerry Springer, etc.). (See notes on the naturalness and authenticity of the language from these transcripts.)
- Fiction: (76 million words) Short stories and plays from literary magazines, children’s magazines, popular magazines, first chapters of first edition books 1990-present, and movie scripts.
- Popular Magazines: (81 million words) Nearly 100 different magazines, with a good mix (overall, and by year) between specific domains (news, health, home and gardening, women, financial, religion, sports, etc.). A few examples are Time, Men’s Health, Good Housekeeping, Cosmopolitan, Fortune, Christian Century, Sports Illustrated, etc.
- Newspapers: (76 million words) Ten newspapers from across the US, including: USA Today, New York Times, Atlanta Journal Constitution, San Francisco Chronicle, etc. In most cases, there is a good mix between different sections of the newspaper, such as local news, opinion, sports, financial, etc.
- Academic Journals: (76 million words) Nearly 100 different peer-reviewed journals. These were selected to cover the entire range of the Library of Congress classification system (e.g. a certain percentage from B (philosophy, psychology, religion), D (world history), K (education), T (technology), etc.), both overall and by number of words per year."