<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Science and Language</title>
	<atom:link href="http://kochanski.org/blog/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://kochanski.org/blog</link>
	<description>Slow blogging from the research side</description>
	<lastBuildDate>Thu, 17 May 2012 12:33:18 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>&#8220;On Bullshit&#8221;</title>
		<link>http://kochanski.org/blog/?p=730</link>
		<comments>http://kochanski.org/blog/?p=730#comments</comments>
		<pubDate>Thu, 17 May 2012 12:12:17 +0000</pubDate>
		<dc:creator>gpk</dc:creator>
				<category><![CDATA[academics]]></category>
		<category><![CDATA[language]]></category>
		<category><![CDATA[bullshit]]></category>
		<category><![CDATA[communication]]></category>
		<category><![CDATA[philosophy]]></category>
		<category><![CDATA[truth]]></category>

		<guid isPermaLink="false">http://kochanski.org/blog/?p=730</guid>
		<description><![CDATA[&#8220;On Bullshit&#8221; by Harry G. Frankfurt is a beautiful, tiny, insightful book that defines bullshit. &#8220;Why define bullshit&#8221;, you might ask?    Why not?  It&#8217;s probably 90% of what you can read on the Internet.   It&#8217;s probably a large chunk of what you hear from your upper management at work.  It&#8217;s probably 90% of [...]]]></description>
			<content:encoded><![CDATA[<p>&#8220;<a href="http://books.google.com/books/about/On_Bullshit.html?id=bFpzNItiO7oC">On Bullshit</a>&#8221; by Harry G. Frankfurt is a beautiful, tiny, insightful book that defines bullshit.</p>
<p>&#8220;Why define bullshit&#8221;, you might ask?    Why not?  It&#8217;s probably 90% of what you can read on the Internet.   It&#8217;s probably a large chunk of what you hear from your upper management at work.  It&#8217;s probably 90% of political advertisements, and it&#8217;d probably be 90% of commercial advertising if there weren&#8217;t any &#8220;truth in advertising&#8221; laws.   <span style="color: #008000;">[Those laws are why drug companies give you a feel-good video that implies you will be floating through fields of flowers after a dose of Fluoxidyne, while a speed reader lists side-effects and says "take this only on the recommendation of your physician." -- No law regulates the images and implications in the ads, just the words and the formal claims.   Absent those laws, most companies would probably tell you</span> <a href="http://www.hairraisingstories.com/Products/HALL_HR.html">wondrous bullshit</a> <span style="color: #008000;">about their products.]</span></p>
<p>So, it&#8217;s entirely reasonable for a philosopher to worry about what separates bullshit from truth and lies.   Clearly, bullshit isn&#8217;t the same thing as telling the truth, and it&#8217;s not quite the same thing as lying, either.   At 67 tiny pages, the book is probably the smallest philosophy book you&#8217;ll ever read.   <span style="color: #008000;">[That may be because he stopped writing before he started producing bullshit.]</span></p>
]]></content:encoded>
			<wfw:commentRss>http://kochanski.org/blog/?feed=rss2&#038;p=730</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Does phonology explain speech?</title>
		<link>http://kochanski.org/blog/?p=711</link>
		<comments>http://kochanski.org/blog/?p=711#comments</comments>
		<pubDate>Fri, 13 Apr 2012 01:35:38 +0000</pubDate>
		<dc:creator>gpk</dc:creator>
				<category><![CDATA[academics]]></category>
		<category><![CDATA[brains]]></category>
		<category><![CDATA[ethics]]></category>
		<category><![CDATA[language]]></category>
		<category><![CDATA[science and how it works]]></category>
		<category><![CDATA[writing]]></category>
		<category><![CDATA[brain]]></category>
		<category><![CDATA[citations]]></category>
		<category><![CDATA[errors]]></category>
		<category><![CDATA[evidence]]></category>
		<category><![CDATA[Language]]></category>
		<category><![CDATA[phonetics]]></category>
		<category><![CDATA[phonology]]></category>
		<category><![CDATA[prediction]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[self-delusion]]></category>
		<category><![CDATA[structure]]></category>
		<category><![CDATA[teaching]]></category>
		<category><![CDATA[underlying]]></category>
		<category><![CDATA[who-watches-the-watchers]]></category>

		<guid isPermaLink="false">http://kochanski.org/blog/?p=711</guid>
		<description><![CDATA[Just recently, I looked at an abstract of a phonetics paper and saw this: Many theories of speech production see phonetic implementation of underlying phonological structures as an automatic process governed by universal and sometimes language-specific constraints. Such assumption for example underlies the phonological account of&#8230; I used to  grudgingly sign my name to such sentences [...]]]></description>
			<content:encoded><![CDATA[<p>Just recently, I looked at an abstract of a phonetics paper and saw this:</p>
<p style="padding-left: 30px;">Many theories of speech production see phonetic implementation of underlying phonological structures as an automatic process governed by universal and sometimes language-specific constraints. Such assumption for example underlies the phonological account of&#8230;</p>
<p>I used to  grudgingly sign my name to such sentences back when I was a linguist.  But now I make my money in other ways and I have no reason to pay attention to the Linguistic community.   So, let&#8217;s take a look at the text and see what&#8217;s right and what&#8217;s nonsense.</p>
<p>&#8220;<strong>Many theories of speech production &#8230; phonetic implementation of &#8230; phonological&#8230;</strong>&#8221;   <span style="color: #000080;">The trouble is that &#8220;many&#8221; is the wrong word because there are<em> no</em> Linguistic theories of speech production.  Linguists, when being careful, tend to call their things &#8220;paradigms&#8221; or similar terms, in recognition of the fact that they don&#8217;t normally predict what is going to happen.  <span style="color: #008000;">[Often, they are intended as a way of looking at the data or organizing it.]</span></span></p>
<ul>
<li>In this context, a theory is something that predicts your phonetic implementation (i.e. how you say something, in terms of sounds or tongue positions) from some sort of phonology.   When one says the word &#8220;theory&#8221; in science, one means some sort of prediction that is clear enough that one could prove it to be false.    <span style="color: #008000;">[This is standard Karl Popper stuff.  You make up a theory and then you test it by comparing its prediction to reality.  If it disagrees with a good experiment, the theory is wrong, and you change it or throw it away.  If it agrees with the experiment, it lives to be tested another day.   If it passes many tests, wow, maybe it's actually correct!]</span>  Now, the problem is that there are no linguistic theories in this area, because no one has written anything that makes clear predictions that one can test.  If you don&#8217;t believe me, take a look at the <a title="Google Scholar" href="http://scholar.google.com" target="_blank">Google Scholar</a> search for <a title="phonetic implementation phonological theory predict" href="http://scholar.google.com/scholar?q=phonetic+implementation+phonological+theory+predict&amp;hl=en&amp;btnG=Search&amp;as_sdt=1%2C39&amp;as_sdtp=on" target="_blank">&#8220;phonetic implementation phonological theory predict&#8221;</a> and try to find a paper that makes a detailed prediction of which speech sounds are produced when, or any kind of detail about where the tongue goes.  You won&#8217;t.</li>
<li>You may find a few weak predictions along the lines of &#8220;the tongue will be raised&#8230;&#8221; but those don&#8217;t count for much because they have a 50% chance of being right by blind chance.  Another reason they are weak is that the &#8220;prediction&#8221; is usually derived just by trying to say something and feeling where the tongue goes; in science, that&#8217;s normally called &#8220;reporting the result of an experiment&#8221; rather than &#8220;predicting from a theory&#8221;.</li>
<li>To an extent, such theories exist in the form of speech synthesis and/or speech recognition systems.  While linguists don&#8217;t treat those systems as theories, nevertheless they make detailed predictions of how a given text should sound, even if it is a text that no one has ever said before.  However, while these systems are much more predictive than a typical linguistic paper, they are still not entirely satisfactory because they are all based on recorded speech.   That means, they are not predicting as much as one might hope: their output is just a reproduction of what some human said, but sliced up and glued back together.  Their strongest claims to theory-hood are the places where they cut, and the details of what speech segments they glue back together.   They say that &#8220;if we cut speech <em>here</em> and <em>here</em>, then we can glue <em>those</em> bits and we&#8217;ll get a whole new word.&#8221;</li>
</ul>
<p><strong>&#8220;&#8230;underlying phonological structure&#8230;&#8221;</strong>    <span style="color: #000080;">The word &#8220;underlying&#8221; is beautiful, because it subtly claims that phonology is the reality, and the tongue motions and sounds we hear are mere secondary phenomena.  But, that word has no right to be used: phonology could just as well be a convenient way of dividing up the sounds that we make, rather than something deeper and more important.   It might be just an artificial way of chopping the world up into convenient boxes.</span></p>
<ul>
<li>Is phonology underlying?  I.e. is it something that is more real and more stable than speech sounds?  That&#8217;s what &#8220;underlies&#8221; means in this context: for example electrons underlie the phenomena of electricity.  That means that electrons (and a few equations) are able to explain anything you might want to know about electricity.</li>
<li>If phonology were really &#8220;underlying&#8221; in that sense, the evidence for it [hopefully experiments!] would be triumphantly taught in undergraduate courses.  These would be basic experiments that would support the entire field of Linguistics.  But, the evidence that phonology underlies behaviour isn&#8217;t taught, because (to the extent it exists at all) it is ambiguous.</li>
<li>If phonology were really proven to be &#8220;underlying&#8221;, people who wrote the phrase in their academic papers would reference it.  <span style="color: #008000;">[Unless it were so well and broadly known (like Newtonian Mechanics) that it was taught in undergraduate courses.]</span>    Let&#8217;s <a href="http://scholar.google.com/scholar?q=%22underlying+phonological+structure%22&amp;hl=en&amp;btnG=Search&amp;as_sdt=1%2C39&amp;as_sdtp=on" target="_blank">search</a> for it.   As of today, I get the following hits:<br />
* I. Y. Liberman, <a href="http://dx.doi.org/10.1177/074193258500600604" target="_blank">doi: 10.1177/074193258500600604</a> .  No reference.<br />
* I. Y. Liberman, <a href="http://books.google.com/books?id=-b7enpRk6SQC&amp;lpg=PA3&amp;ots=iNjLNPrnp6&amp;dq=%22underlying%20phonological%20structure%22&amp;lr&amp;pg=PA3#v=onepage&amp;q=%22underlying%20phonological%20structure%22&amp;f=false" target="_blank">book</a>.  No reference<br />
* Bert Vaux, <a href="http://dx.doi.org/10.1162/002438998553833" target="_blank">doi:10.1162/002438998553833</a>.  No reference.<br />
* A. Lahiri, <a id="ddDoi" href="http://dx.doi.org/10.1016/0010-0277(91)90008-R" target="doilink">http://dx.doi.org/10.1016/0010-0277(91)90008-R</a>.  No reference.<br />
* et cetera, through 50 references at least.<br />
Everyone writes as if their reader knows what &#8220;underlying phonological structure&#8221; is, but no one points to evidence.  Of course, they&#8217;ve all read each other&#8217;s papers, and they&#8217;ve gotten used to reading that phrase.</li>
<li>In the real world, phonology is always derived by observing speech.  One cannot know what words someone is saying unless one listens to the speech and then deduces what the phonology ought to have been.    This procedure is a better match to &#8220;phonology is a convenient way of categorizing speech&#8221; than to &#8220;phonology underlies and explains speech&#8221;.</li>
<li>Given the lack of proof that phonology underlies anything, it might be inappropriate to use this phrase.</li>
</ul>
<p>Then after that rough start, the abstract improves.    The author is careful to say &#8220;theories see&#8221; instead of &#8220;theories show&#8221; and then later &#8220;Such assumption&#8230;&#8221;   The author does indeed know that this stuff is just assumed, rather than known.  Because so many of the basic assumptions of linguistics are not solidly established, it&#8217;s very hard to write carefully and still make an abstract sound impressively linguistic.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://kochanski.org/blog/?feed=rss2&#038;p=711</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What&#8217;s wrong with this experiment?</title>
		<link>http://kochanski.org/blog/?p=703</link>
		<comments>http://kochanski.org/blog/?p=703#comments</comments>
		<pubDate>Sat, 24 Mar 2012 22:00:04 +0000</pubDate>
		<dc:creator>gpk</dc:creator>
				<category><![CDATA[academics]]></category>
		<category><![CDATA[ethics]]></category>
		<category><![CDATA[science and how it works]]></category>
		<category><![CDATA[techniques]]></category>
		<category><![CDATA[wild ideas]]></category>
		<category><![CDATA[writing]]></category>
		<category><![CDATA[biology]]></category>
		<category><![CDATA[communication]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[data analysis]]></category>
		<category><![CDATA[experimental design]]></category>
		<category><![CDATA[exploration]]></category>
		<category><![CDATA[funding]]></category>
		<category><![CDATA[impact]]></category>
		<category><![CDATA[poster]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[toxicology]]></category>

		<guid isPermaLink="false">http://kochanski.org/blog/?p=703</guid>
		<description><![CDATA[[ Note: the image is copyright http://jerryabuan.zenfolio.com/.  I'm grateful for it.] Actually, it looks pretty good: it proves its point fairly clearly.   Ordinarily, I&#8217;d say that there ought to be a bigger control group, but I think it&#8217;s reasonable to believe that the domestic teddy bear reproduces reliably under laboratory conditions.   So having one [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://i.imgur.com/wKcfJ.jpg"><img class="alignnone" title="Teratogenic Effects of Pure Evil in Ursus Teddius Domesticus" src="http://i.imgur.com/wKcfJ.jpg" alt="Teddy bears with three legs.  Mock scientific poster." width="518" height="346" /></a></p>
<p>[ Note: the image is copyright <a href="http://jerryabuan.zenfolio.com/">http://jerryabuan.zenfolio.com/</a>.  I'm grateful for it.]</p>
<p>Actually, it looks pretty good: it proves its point fairly clearly.   Ordinarily, I&#8217;d say that there ought to be a bigger control group, but I think it&#8217;s reasonable to believe that the domestic teddy bear reproduces reliably under laboratory conditions.   So having one control bear is a bit weak, but probably OK, especially given the magnitude of the effects we see when Pure Evil is introduced.</p>
<ul>
<li>There&#8217;s plenty of room for a brief mention of some of the other literature on Pure Evil.</li>
<li>The data at 1000ppm should be supported by some evidence other than a note.    Perhaps a photo of the cage bars after the teddy bear chewed through them?</li>
<li>There is nothing presented to make the reader believe that the purple stuff in the 700 ppm offspring is from a different species.   Quite possibly, it is merely mutated teddy bear tissue.</li>
<li>There should be a note that the experiment and the treatment of the experimental subjects were approved by an ethics committee.</li>
<li>Normally, one would expect to see an acknowledgement for the source of the experimental funding.</li>
<li>If the two postdocs who were killed in the experiment contributed to the research, then they should be co-authors.   Of course, if they had just been standing there, authorship would not be appropriate (and I&#8217;m not sure whether an acknowledgement would be in good taste or not&#8230;).</li>
<li>It would have been good to have taken tissue samples of the 1000 ppm parent after it was euthanized.  Possibly some light could have been thrown on the biochemical mechanisms of ocular luminescence.   But, perhaps that&#8217;s the subject of a separate paper.</li>
<li>It would be good to have a laboratory analysis of the Pure Evil.    What if it weren&#8217;t pure?   After all, materials recovered from exploded toasters are often contaminated with bread crumbs, melted plastic, or plaster dust.</li>
</ul>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://kochanski.org/blog/?feed=rss2&#038;p=703</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Stanford Encyclopaedia of Philosophy (Sounds)</title>
		<link>http://kochanski.org/blog/?p=618</link>
		<comments>http://kochanski.org/blog/?p=618#comments</comments>
		<pubDate>Fri, 23 Mar 2012 02:41:56 +0000</pubDate>
		<dc:creator>gpk</dc:creator>
				<category><![CDATA[academics]]></category>
		<category><![CDATA[brains]]></category>
		<category><![CDATA[language]]></category>
		<category><![CDATA[scams]]></category>
		<category><![CDATA[science and how it works]]></category>
		<category><![CDATA[writing]]></category>
		<category><![CDATA[acoustics]]></category>
		<category><![CDATA[answering-the-wrong-question]]></category>
		<category><![CDATA[brain]]></category>
		<category><![CDATA[communication]]></category>
		<category><![CDATA[errors]]></category>
		<category><![CDATA[Language]]></category>
		<category><![CDATA[noise]]></category>
		<category><![CDATA[philosophy]]></category>
		<category><![CDATA[self-delusion]]></category>
		<category><![CDATA[sound]]></category>
		<category><![CDATA[who-watches-the-watchers]]></category>

		<guid isPermaLink="false">http://kochanski.org/blog/?p=618</guid>
		<description><![CDATA[The Stanford Encyclopaedia of Philosophy is an interesting and sophisticated resource.  But today, I happened to google into its article on &#8220;sounds&#8221;, by Roberto Casati &#60;casati@ehess.fr&#62; and Jerome Dokic &#60;Jerome.Dokic@ehess.fr&#62; and I became dismayed.  It seems to ask the wrong questions and discuss things that are just useless and wrong-headed.  Here&#8217;s where the article lays out [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://plato.stanford.edu">Stanford Encyclopaedia of Philosophy</a> is an interesting and sophisticated resource.  But today, I happened to google into its article on &#8220;<a href="http://plato.stanford.edu/entries/sounds/">sounds</a>&#8221;, by <a href="http://roberto.casati.free.fr/casati/roberto.htm" target="other">Roberto Casati</a> &lt;<a href="mailto:casati%40ehess%2efr"><em>casati<abbr title=" at ">@</abbr>ehess<abbr title=" dot ">.</abbr>fr</em></a>&gt; and <a href="http://www.institutnicod.org/" target="other">Jerome Dokic</a> &lt;<a href="mailto:Jerome%2eDokic%40ehess%2efr"><em>Jerome<abbr title=" dot ">.</abbr>Dokic<abbr title=" at ">@</abbr>ehess<abbr title=" dot ">.</abbr>fr</em></a>&gt; and I became dismayed.  It seems to ask the wrong questions and discuss things that are just useless and wrong-headed.  Here&#8217;s where the article lays out what it thinks are the philosophically important questions about sounds:</p>
<p style="padding-left: 30px;"><span style="color: #0000ff;">The main issues which are on the table concern the nature of sounds. Sounds enter the content of auditory perception. But what are they? Are sounds individuals? Are they events? Are they properties of sounding objects? If they are events, what type of event are they? What is the relation between sounds and sounding objects? Temporal and causal features of sounds will be important in deciding these and related questions. However, it turns out that a fruitful way to organize these issues deals with the spatial properties of sounds.</span></p>
<p style="padding-left: 30px;"><span style="color: #0000ff;">Indeed, the various philosophical pronouncements about the nature of sounds can be rather neatly classified according to the spatial status each of them assigns to sounds. Where are sounds? Are they anywhere?</span></p>
<p><span style="color: #0000ff;"><span style="color: #000000;">None of these questions make much sense in the real world.  As a physicist and speech researcher, I know what goes on in the world shortly before we perceive a sound:</span></span></p>
<ol>
<li><span style="color: #0000ff;"><span style="color: #000000;">Something vibrates.</span></span></li>
<li><span style="color: #0000ff;"><span style="color: #000000;">It causes the air near it to vibrate.</span></span></li>
<li><span style="color: #0000ff;"><span style="color: #000000;">The vibration travels through the air as a pressure wave.  The wave spreads out in all directions.<br />
</span></span></li>
<li><span style="color: #0000ff;"><span style="color: #000000;">The pressure wave is modified by all objects it passes.   Some things reflect parts of the wave, some things absorb parts.</span></span></li>
<li><span style="color: #0000ff;"><span style="color: #000000;">The wave gets to your ear, and it makes your ear drum vibrate.  That causes the oval window of the <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Cochlea">cochlea</a> to vibrate, which makes the fluid inside vibrate.</span></span></li>
<li><span style="color: #0000ff;"><span style="color: #000000;">Hair cells are triggered by the vibrations and send nerve impulses up the auditory nerve to the <a href="http://en.wikipedia.org/wiki/Primary_auditory_cortex">auditory cortex</a>.</span></span></li>
<li><span style="color: #0000ff;"><span style="color: #000000;">It gets complicated.</span></span></li>
</ol>
<p>Sounds (whatever the philosophers think they are) are the end-result of a complex process that is spread over space and time.  For instance, as I hear a car drive by, my perception is caused by:</p>
<ol>
<li>Rubber tires unsticking from the rough road surface and bouncing back into shape.   These sounds are, of course, made by four tires simultaneously.</li>
<li>The sound waves bounce around under the car, bounce off the road, and perhaps other cars near by.</li>
<li>The sound wave diffracts over the top of the noise barrier at the side of the highway, and some of it bends down towards my window.</li>
<li>It sneaks through the window somehow, leaking through gaps, vibrating the glass, and mostly bouncing back.</li>
<li>Then it gets to my ear and various perceptual things happen.</li>
</ol>
<p>So, in making that sound, there are kilograms of rubber, a ton of fast-moving car, maybe 300kg of air, two windowpanes, and my head.  It&#8217;s a complicated process that involves a lot of stuff.  Notably, if any of that stuff changes, the sound will change slightly.  Change the car, the windowpanes, the noise barrier, anything.</p>
<p>So we can think about the philosophical questions they ask:</p>
<ul>
<li><span style="color: #0000ff;">Are sounds individuals?</span> Depends on exactly what they mean, but the answer isn&#8217;t likely to be interesting.  If they mean &#8220;individual&#8221; = &#8220;isolated thing&#8221;, the sound of the car tires isn&#8217;t very isolated from other sounds.  Certainly the sounds from the four tires aren&#8217;t very isolated from each other.   If they mean &#8220;individual&#8221; as &#8220;unique, never duplicated&#8221;, then maybe, if we could somehow split the noise we hear into separate sounds, then no two of them would be exactly alike.  (If one can measure precisely enough.)   But, the process of splitting is pretty artificial.</li>
<li><span style="color: #0000ff;">Are they events?</span> By &#8220;event&#8221;, they seem to mean something that has a fairly definite location and time.  But the process described above is spread over many cubic meters of space, and it is only localized in time if you artificially chop the sound into individual moments.</li>
<li><span style="color: #0000ff;">Are they properties of sounding objects?</span> What you hear is a property of the sounding object, the environment, and you.  It&#8217;s not <em>just</em> a property of the sounding object.</li>
<li><span style="color: #0000ff;">What is the relation between sounds and sounding objects?</span> Part of it is called acoustics.  Most of the time, acoustics gives you a complicated mathematical relationship between the sounding object and the motion of your ears, and it is not something that is easily expressible in words.  Trying to express it in terms of words (unless one wishes to be approximate) is a mistake.  This is not a question for philosophers any more &#8212; the physicists took it over in the mid 1800s.</li>
<li><span style="color: #0000ff;">Where are sounds?  Are they anywhere? </span> These are simply silly questions.  Various parts of the process that produces the sound occur in different places.  There is no single place.  It&#8217;s very much like asking &#8220;Where is Julius Caesar now?&#8221;  He&#8217;s not in any one region any more: his atoms are scattered and blow with the breezes.</li>
</ul>
<p>And, then the article sets off to answer these questions.  Not surprisingly, it doesn&#8217;t get much of anywhere.   I&#8217;d be embarrassed to have written that article; it was written from the narrow perspective of people who play games with words in order to score points against other philosophers.  It&#8217;s not really there to help you understand anything.   Or, if it was, they forgot to read all the acoustics literature that&#8217;s accumulated in the last century or two.</p>
]]></content:encoded>
			<wfw:commentRss>http://kochanski.org/blog/?feed=rss2&#038;p=618</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Dark Side of the Fibonacci Sequence</title>
		<link>http://kochanski.org/blog/?p=693</link>
		<comments>http://kochanski.org/blog/?p=693#comments</comments>
		<pubDate>Sat, 17 Mar 2012 01:06:39 +0000</pubDate>
		<dc:creator>gpk</dc:creator>
				<category><![CDATA[math]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[equation]]></category>
		<category><![CDATA[Fibonacci]]></category>
		<category><![CDATA[Golden mean]]></category>
		<category><![CDATA[Golden section]]></category>
		<category><![CDATA[phi]]></category>
		<category><![CDATA[prediction]]></category>
		<category><![CDATA[recursion relationship]]></category>
		<category><![CDATA[sequence]]></category>
		<category><![CDATA[teaching]]></category>

		<guid isPermaLink="false">http://kochanski.org/blog/?p=693</guid>
		<description><![CDATA[This is the defining equation for the Fibonacci sequence: F[n] = F[n-1] + F[n-2]       (See A, B, C, D, E, F, G.) 1.000000000 -0.6180339887498949 0.3819660112501051                  ( 1.000 + -0.618&#8230; = 0.381&#8230; ) -0.2360679774997898               ( -0.618&#8230; + 0.381&#8230; = -0.236&#8230; ) [...]]]></description>
			<content:encoded><![CDATA[<p>This is the defining equation for the Fibonacci sequence:</p>
<p>F[n] = F[n-1] + F[n-2]       (See <a href="http://en.wikipedia.org/wiki/Fibonacci_number">A</a>, <a href="http://mathworld.wolfram.com/FibonacciNumber.html">B</a>, <a href="http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/Fibonacci/fib.html">C</a>, <a href="http://planetmath.org/encyclopedia/ListOfFibonacciNumbers.html">D</a>, <a href="http://plus.maths.org/content/os/issue3/fibonacci/index">E</a>, <a href="http://www.textism.com/bucket/fib.html">F</a>, <a href="http://www.branta.connectfree.co.uk/fibonacci.htm">G</a>.)</p>
<p><span style="color: #0000ff;">1.000000000</span></p>
<p><span style="color: #0000ff;">-0.6180339887498949</span></p>
<p><span style="color: #0000ff;">0.3819660112501051                 <span style="color: #008000;"> ( 1.000 + -0.618&#8230; = 0.381&#8230; )</span></span></p>
<p><span style="color: #0000ff;">-0.2360679774997898             <span style="color: #008000;">  ( -0.618&#8230; + 0.381&#8230; = -0.236&#8230; )</span></span></p>
<p><span style="color: #0000ff;">0.1458980337503153              <span style="color: #008000;">   ( 0.381&#8230; + -0.236&#8230; = 0.145&#8230; )</span></span></p>
<p><span style="color: #0000ff;">-0.09016994374947451           <span style="color: #008000;">  ( -0.236&#8230; + 0.145&#8230; = -0.090&#8230; )</span></span></p>
<p><span style="color: #0000ff;">&#8230;..</span></p>
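<p>The list above can be reproduced with a short sketch.  The function name <code>fib_like</code> is made up for illustration, and the two starting values are simply copied from the first two lines of the list; floating-point rounding may change the last digit or two of what gets printed.</p>

```python
def fib_like(first, second, count):
    """Return `count` terms of F[n] = F[n-1] + F[n-2] from two chosen starting values."""
    terms = [first, second]
    for _ in range(count - 2):
        # Each new term is the sum of the two terms before it.
        terms.append(terms[-1] + terms[-2])
    return terms[:count]


# Starting values copied from the list in the post (an assumption of this sketch).
terms = fib_like(1.0, -0.6180339887498949, 6)
for value in terms:
    print(value)
```

<p>Calling <code>fib_like(1, 1, 10)</code> instead gives the familiar growing sequence, so the only thing that differs between the two cases is the pair of starting values.</p>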
<p>&nbsp;</p>
<p>Why does it get smaller instead of larger?</p>
]]></content:encoded>
			<wfw:commentRss>http://kochanski.org/blog/?feed=rss2&#038;p=693</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>&#8220;Filthy Lucre&#8221; and International Trade</title>
		<link>http://kochanski.org/blog/?p=530</link>
		<comments>http://kochanski.org/blog/?p=530#comments</comments>
		<pubDate>Mon, 12 Mar 2012 03:46:28 +0000</pubDate>
		<dc:creator>gpk</dc:creator>
				<category><![CDATA[economics]]></category>
		<category><![CDATA[politics]]></category>
		<category><![CDATA[accounting]]></category>
		<category><![CDATA[comparative advantage]]></category>
		<category><![CDATA[competitiveness]]></category>
		<category><![CDATA[currency exchange]]></category>
		<category><![CDATA[exchange]]></category>
		<category><![CDATA[finances]]></category>
		<category><![CDATA[foreign trade]]></category>
		<category><![CDATA[free trade]]></category>
		<category><![CDATA[globalization]]></category>
		<category><![CDATA[manufacturing]]></category>
		<category><![CDATA[money]]></category>
		<category><![CDATA[prediction]]></category>
		<category><![CDATA[ricardo]]></category>
		<category><![CDATA[trade]]></category>

		<guid isPermaLink="false">http://kochanski.org/blog/?p=530</guid>
		<description><![CDATA[&#8220;Filthy Lucre&#8221; by Joseph Heath is quite a good book.[1, 2]   It takes a careful, logical look at a lot of economics and social policy, and manages to do it without much political/ideological bias.  ["Unbiassed" is a bit of a fuzzy term, but it gives credit to all sides, when credit is due.]  In [...]]]></description>
			<content:encoded><![CDATA[<p>&#8220;Filthy Lucre&#8221; by <a href="http://homes.chass.utoronto.ca/~jheath/" target="_blank">Joseph Heath</a> is quite a good book.[<a href="http://worthwhile.typepad.com/worthwhile_canadian_initi/2009/05/a-rambling-post-on-joseph-heaths-filthy-lucre.html" target="_blank">1</a>, <a href="http://www.quebecoislibre.org/09/090615-3.htm" target="_blank">2</a>]   It takes a careful, logical look at a lot of economics and social policy, and manages to do it without much political/ideological bias.  <span style="color: #008000;">["Unbiassed" is a bit of a fuzzy term, but it gives credit to all sides, when credit is due.]</span>  In fact, the book takes pride in shooting down some bad ideas from both ends of the political spectrum.  It&#8217;s clearly argued and pleasant to read.  The only real problem with the book is the title, which sounds wildly leftist, far more than the book itself. <span style="color: #008000;"> [The world would be a better place if more people read this book; it helps one think about economics.  FYI, I saw and appreciated a bumper sticker yesterday: "<a title="Critical Thinking: The Other National Deficit" href="http://politicalhumor.about.com/od/Rally-to-Restore-Sanity/ig/Funniest-Signs-Rally-to-Restore-Sanity/Critical-Thinking-Deficit.htm" target="_blank"><span style="color: #008000;">Critical Thinking: The Other National Deficit</span></a>".]</span></p>
<p>But there&#8217;s one place where it falls down: Chapter 5, &#8220;Uncompetitive in Everything&#8221;.  This is an analysis of international trade, and Heath attempts to show that globalization does not force your country to become more competitive.  He makes the claim that you cannot be uncompetitive in everything.</p>
<p>Heath uses an example of two bakeries: A &#8220;Rich Side&#8221; bakery that is especially good at making bagels, but relatively inefficient at making tarts, and a &#8220;Poor Side&#8221; bakery that is good at tarts and poor at bagels.   The bakeries are in different countries, so they do <em>not</em> share a common currency.  He goes on to show that &#8212; even if the labor costs are much higher on the rich side &#8212; both sides are better off if they trade bagels for tarts.</p>
<p>The essential argument is this:  suppose the rich side makes bagels for £1 and tarts for £2.   Suppose the poor side makes bagels for ¥2 and tarts for ¥1.  <span style="color: #008000;">[I'm not saying Brits are rich and Japanese are poor here: £ and ¥ are just two convenient, different currency symbols that people will recognize.]</span>  Now, someone on the poor side has two ways to get a bagel.   One way is that they can make it locally, for ¥2.   The other way is that they can make a tart (for ¥1) and sell it to someone on the rich side.  If international trade is rare, they&#8217;ll get a price near £2, and they can then go over to the rich side and buy two bagels with that amount of money.      So, with international trade, they could end up with two bagels for ¥1: half the local price of a single bagel.     And, of course, the same logic applies to both sides.     Trade benefits both countries.  This isn&#8217;t a new idea.  It comes from David Ricardo, around 1817.  [<a href="http://en.wikipedia.org/wiki/Comparative_advantage" target="_blank">1</a>, <a href="http://www.youtube.com/watch?v=Pd_qs8ueIWw&amp;feature=related" target="_blank">2</a>, <a href="http://www.econlib.org/library/Topics/Details/comparativeadvantage.html" target="_blank">3</a>]</p>
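<p>As a rough sketch, the arithmetic of the bakery example can be written out in a few lines of Python.  The names below are illustrative and do not come from Heath&#8217;s book, and the £2 sale price is the text&#8217;s assumption that international trade is rare:</p>

```python
from fractions import Fraction

# Production costs from the bakery example, each in its own local currency.
RICH_COST = {"bagel": Fraction(1), "tart": Fraction(2)}  # pounds
POOR_COST = {"bagel": Fraction(2), "tart": Fraction(1)}  # yen


def bagels_via_trade(tart_price_in_pounds):
    """Bagels a poor-side baker can buy abroad after selling one tart there."""
    return tart_price_in_pounds / RICH_COST["bagel"]


# Assumption from the text: trade is rare, so a tart sells for about 2 pounds.
bagels = bagels_via_trade(Fraction(2))

# Yen spent per bagel on each route.
yen_per_bagel_traded = POOR_COST["tart"] / bagels
yen_per_bagel_local = POOR_COST["bagel"]

print(f"Bagels per exported tart: {bagels}")
print(f"Yen per bagel via trade: {yen_per_bagel_traded}")
print(f"Yen per bagel made locally: {yen_per_bagel_local}")
```

<p>Exact fractions are used so that the comparison is not blurred by floating-point rounding.  Swapping the two cost tables gives the mirror-image calculation for a rich-side baker who wants tarts.</p>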
<p>The arguments presented are absolutely correct, as far as they go.  But the model presented is just too simple to capture a couple of important effects in the real world.   Two goods is not enough.  The trouble is, there need to be at least two other goods in the model:</p>
<ol>
<li>Goods that you cannot produce at all in your country.    Foreign travel is one example of these.   If you live in Alberta (or Yorkshire), you cannot produce a trip to Paris no matter how hard you try.  You can only do it in France.   Another example is stuff that requires huge capital or intellectual investments, like semiconductor fabrication plants.   Building anything close to state-of-the-art chips takes multi-billion dollar investments and lots of specially educated people, and most nations will not have the capability within their borders.</li>
<li>Goods that you really <em>don&#8217;t</em> want to sell, on moral/ethical grounds.     Such as the slave trade, sex tourism, one&#8217;s kidney, or the right for other nations to dump their waste on your ground.   And, there are a whole bunch of milder variants of #2 that one would also prefer to avoid:  for instance, &#8220;Ship us your recycling, we&#8217;ll sort it and send it back.&#8221; &#8212; honest, but dirty jobs.</li>
</ol>
<p>Now, the trouble is, there is a certain insatiable demand for #1.  Everyone wants to do a little travel, and the political elite will arrange to do some, by fair means or foul.   Or, the elite will send their kids to expensive private schools in England.   And, you can bet that whoever has money and power will have new PCs and iPads or whatever nearby.   This demand is not very elastic, and will never go quite to zero no matter what the exchange rate is.</p>
<p>Therefore, if conventional trade becomes sufficiently uncompetitive, the outflow of money for #1 will end up being balanced against category #2.    Therefore, it <em>does</em> matter (to a degree) how competitive you are.   You don&#8217;t want your industries to be less valuable to outsiders than things in category #2.</p>
<p><span style="color: #008000;">[Oh, and there are other ways in which Ricardo's analysis doesn't quite represent reality.   It assumes that the bagels produced on the rich side and the poor side are completely indistinguishable (which might easily be false).  It also assumes that trade is free, that the costs of trade are small, and that prices are reasonably stable, so that there is little risk of the price changing badly between the time you decide to bake tarts and the time you finally end up buying bagels.   All these simple models are<span style="color: #99cc00;"> [of course]</span> simplifications of the real world.]</span></p>
]]></content:encoded>
			<wfw:commentRss>http://kochanski.org/blog/?feed=rss2&#038;p=530</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Post-thesis anti-chronological advice</title>
		<link>http://kochanski.org/blog/?p=678</link>
		<comments>http://kochanski.org/blog/?p=678#comments</comments>
		<pubDate>Tue, 30 Aug 2011 08:30:02 +0000</pubDate>
		<dc:creator>Paradis</dc:creator>
				<category><![CDATA[academics]]></category>
		<category><![CDATA[science and how it works]]></category>
		<category><![CDATA[techniques]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[advice]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[communication]]></category>
		<category><![CDATA[confusion]]></category>
		<category><![CDATA[data analysis]]></category>
		<category><![CDATA[digital humanities]]></category>
		<category><![CDATA[errors]]></category>
		<category><![CDATA[Language]]></category>
		<category><![CDATA[past]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[teaching]]></category>
		<category><![CDATA[writing]]></category>

		<guid isPermaLink="false">http://kochanski.org/blog/?p=678</guid>
		<description><![CDATA[I am mailing the following letter back in time to myself in the summer of 2005. That summer, I began the research that would ultimately become my DPhil in computational linguistics. Dear Michel, I hope you are enjoying your summer. The sun is bright, the weather warm and the beach as good a place as any [...]]]></description>
			<content:encoded><![CDATA[<p><span style="color: #008000;">I am mailing the following letter back in time to myself in the summer of 2005. That summer, I began the research that would ultimately become my DPhil in computational linguistics.</span></p>
<p>Dear Michel,</p>
<p>I hope you are enjoying your summer. The sun is bright, the weather warm and the beach as good a place as any to learn how to code in Java. Six years hence, I am the last person to begrudge you this sunny start to research that will take about three years longer than you think it will. So enjoy.</p>
<p>But let me pass on a few pieces of advice that can all harken back to an old military saying – “Amateurs talk strategy; professionals talk logistics.” The big ideas are important, but in the end the hardest part of what you will do is refining those big ideas into testable hypotheses that you then can actually test. Once you do that, it turns out you will have proof that you are up to this DPhil and have something worth submitting. Until then, you are an amateur.</p>
<p>Don’t get me wrong. Right now, you are bursting with ideas for how to answer the most interesting questions at the frontiers of computational linguistics. And that is a very good thing. Those ideas will keep you going through the cruel days it will take you to fully understand how to compute an Eigenvalue Decomposition or a G-Function. Enjoy them. Let them turn over in your mind as you have that second beer.</p>
<p>That said, please, for my sake, write them down – and not on the window in your room. You are going to move more than once over the next six years and the landlord just might come in when you are not home and clean off the mess. To be sure, the best of what you come up with will come back to you at some point, at least as far as I remember. But still, keep track of your ideas, your research and your sources. Maybe start a journal with dates and short, legible explanations. Six years from now when you are editing your bibliography in BibTeX (it’s a tool for managing bibliographies in LaTeX documents, don’t worry about it now), you will thank me. It turns out that (Brown 2000) is kind of ambiguous as citations go, and hunting such sources down all over again made me very mad at you for being so lazy.</p>
<p>Another thing to keep in mind is that you should not underestimate your capacity needs. Too much time and energy will be spent trying to fit what you want to do into the resources you have at your disposal. It is easier (and, if you value your time, cheaper) to just get more resources. Right now, you are coding on that trusty laptop you bought before you started school. To be sure, it will serve you surprisingly well. You will find that it will be up to the data processing needed for your MPhil thesis, but you will have to put it on its side on the floor with a desk fan at its back to prevent it from overheating.</p>
<p>Ultimately, you will build a small city of towers. ‘Towers?’, you’re thinking, ‘Isn’t that a little excessive?’ No. It’s not. You will get a number of them over the next few years. One will explode – loudly – and fill your apartment with the smell of burnt metal. Ultimately you will have about four, sometimes five, even six on the bad days, running at any given time.</p>
<p>Which leads me to a very important point: do not underestimate your ability to confuse yourself. Four to six computers running at the same time is surprisingly complex to manage. The obvious answer will be trying to figure out a way of networking them together and threading your code. Don’t waste your time. This is exactly what I mean about not confusing yourself. Your thesis is in computational linguistics. You <em>use</em> computer science; you don’t <em>study</em> it. If you want to inject an unholy amount of complexity and confusion into the next six years of your life, network your computers.</p>
<p>Or you could do what you will inevitably do anyway. Just copy the essential bits of your data five or six times and network them via your sneakers, flash drives and GMail. Simple, dirty, cheap, but it works.</p>
<p>If you are worried that doing this means that you are not sufficiently challenging yourself, don’t be. Even this will continue to surprise you with opportunities to confuse yourself. At 3:30 AM, when you have searched every line of code to find where you are dividing by zero because it seems infinities have recruited all of the numbers in your results into their Borg, you will realize you are actually looking at code you wrote three months ago on a different machine. You can formalize this into a convenient equation: Self = Confusion/0.</p>
<p>This is how you will get to Sesame Street. Yes, sunny days, chasing the clouds away Sesame Street. High-resolution pictures of everyone in the neighborhood are freely available on Google Images and, as desktop wallpaper, they will give a face to each of the computers you will have churning away over the next few years. That way, when you finally get your act together and begin a proper coding log, you can amuse yourself with entries like, “Bert exploded after the power source apparently overheated, hard drive recovered.”</p>
<p>The hilarity never ends and some people, like your future wife, will argue that it never started. To be sure, turning your computers into Sesame Street characters is evidence of infantile regression and perhaps some light madness. But if there is one thing that is true about writing a DPhil, particularly one that demands extensive computational work, it is that it is a long, lonely process. More than once you will ask yourself why you are doing this at all, especially after you take that job in a few years and the thesis fills each and every one of your nights, weekends and days off.</p>
<p>But you will find, as you are looking at Ernie at 2:15 AM on a Saturday night with ‘Rubber Ducky’ stuck in your head as you again try to optimize and debug your code, that you have actually come a long way. You have gotten efficient at coding. You have actually learned statistics and a fair bit of upper level linear algebra. Math is no longer voodoo to you, which is an accomplishment in itself. Indeed, when the lawyers at work ask you about your thesis and you begin to explain it, they will squint at you with the suspicion they would use if they suspected you were concealing a prior felony conviction. (Oh, and by the way, don’t tell them the Sesame Street thing. That makes the squinting much harder.)</p>
<p>The last piece of advice is to be grateful for all of those (real people) around you. When your advisor tells you that a certain experiment or data are crap, believe him and move on; he’s right. Take your future wife on plenty of dates. She will be surprisingly understanding of your need to spend all of your free time on Sesame Street, the SATA cables and menagerie of components with which you decorate the apartment and the occasional chorus of expletives from the other room that wakes her in the middle of the night. And be grateful for your future self, who was kind enough to write you this letter. He has just saved you a lot of time and aggravation.</p>
<p>Your friend,</p>
<p>Michel</p>
]]></content:encoded>
			<wfw:commentRss>http://kochanski.org/blog/?feed=rss2&#038;p=678</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ranking Universities</title>
		<link>http://kochanski.org/blog/?p=603</link>
		<comments>http://kochanski.org/blog/?p=603#comments</comments>
		<pubDate>Wed, 18 May 2011 19:44:19 +0000</pubDate>
		<dc:creator>gpk</dc:creator>
				<category><![CDATA[academics]]></category>
		<category><![CDATA[news media]]></category>
		<category><![CDATA[politics]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[exploration]]></category>
		<category><![CDATA[funding]]></category>
		<category><![CDATA[opinions]]></category>
		<category><![CDATA[self-delusion]]></category>
		<category><![CDATA[self-fulfilling-prophecy]]></category>
		<category><![CDATA[soliton]]></category>
		<category><![CDATA[teaching]]></category>
		<category><![CDATA[who-watches-the-watchers]]></category>

		<guid isPermaLink="false">http://kochanski.org/blog/?p=603</guid>
		<description><![CDATA[Who is best?   Cambridge, Oxford, Harvard, MIT, Princeton, Duke, Imperial College, &#8230; ? My daughter applied to quite a number of universities this year, and I am feeling like an ill-informed consumer.  Not so much because I haven&#8217;t informed myself, but because the data I see is either of dubious relevance or follows a dubious [...]]]></description>
			<content:encoded><![CDATA[<p>Who is best?   Cambridge, Oxford, Harvard, MIT, Princeton, Duke, Imperial College, &#8230; ?</p>
<p>My daughter applied to quite a number of universities this year, and I am feeling like an ill-informed consumer.  Not so much because I haven&#8217;t informed myself, but because the data I see is either of dubious relevance or follows a dubious methodology.</p>
<p>Most of these educational rankings are built of five kinds of factors:</p>
<ol>
<li>Money.   <span style="color: #0000ff;">E.g. how much money comes in.</span></li>
<li>Counting people.   <span style="color: #0000ff;">E.g. student/teacher ratios.</span></li>
<li>Counting research results.  <span style="color: #0000ff;">E.g. published research papers and Nobel prizes.</span></li>
<li>Opinions.  <span style="color: #0000ff;">E.g. what people think about it.</span></li>
<li><span style="color: #0000ff;"><span style="color: #000000;">Admission statistics.</span> E.g. what fraction do they accept and what fraction take up the offer.<br />
</span></li>
</ol>
<p>The trouble is that none of these factors are particularly satisfactory:</p>
<ol>
<li>Money doesn&#8217;t directly educate people.</li>
<li>Smaller classes ought to be better, but when I read the research literature [e.g. <a href="http://eric.ed.gov/ERICWebPortal/detail?accno=ED471331 ">1</a>, <a href="http://www.avongrove.org/district/Newsroom/ClassSizeTaskForce/ArticlesResearch/ClassSizeDebate.pdf#page=16">2</a>] it doesn&#8217;t seem to make a big difference.  And, those studies are mostly done on primary schools, not universities, so who knows how relevant they are?   Universities also have the complication of a wide variety of class sizes, from single-person tutorials (e.g. Oxford) up to huge lectures (e.g. nearly everywhere).  Thus, you can have two universities with the same student/teacher ratio: one could have all mid-size classes and the other could have a mix of small classes and large lectures.</li>
<li>I am a firm believer that education should be done by people who have enough knowledge of the field to admit all the things we don&#8217;t know.  Researchers are good at that; people who do nothing but teach must surely be tempted to compromise and to teach what the students want to hear.  In my experience, students want short, simple, complete answers and textbooks often supply such answers even if the truth is messier (for example, see <a title="Textbooks said it was pitch, but it wasn't." href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.74.4351" target="_blank">here</a>).  <strong>BUT</strong>, research isn&#8217;t the same thing as teaching, so it can&#8217;t be more than a correlation.</li>
<li>Everyone has opinions, but what are they based on?  Ideally, you&#8217;d like to ask a lot of people who went undergraduate at X, graduate school at Y, and taught at Z, and then put <em>their</em> opinions together.  Nobody can really compare more than about three universities, not if you want an informed choice.  But what is usually done?  They ask deans to rate all 500 universities in the survey.  A dean probably knows something, first-hand, about five, and he/she has good second-hand rumors from people at another dozen, perhaps.  Where do the rest of the ratings come from?  Thin air, or perhaps from reading last year&#8217;s survey.</li>
<li>The acceptance rate just depends on how many people apply.   So, that&#8217;s just the opinion of parents.  The fraction of people who take up the offer is just the opinion of students who haven&#8217;t been to university yet, so that&#8217;s not particularly helpful.</li>
</ol>
<p>This was all made more obvious by moving from the US to the UK a few years ago.  Now, in 2011, I have pretty strong opinions about a lot of UK universities, many of which I had never even heard of when I lived in the States.  And, I had totally missed some pretty good places, like Warwick or Imperial College.  I did know that Durham was pretty good, a place that few other people in the US had heard about, but I only had a high opinion of it because I worked with a good grad student who came from Durham.  Without that one chance encounter, Durham would have been a blank.</p>
<p>Now, I ask myself, &#8220;Where do my opinions about UK universities come from?&#8221;  Not from personal experience!  I&#8217;ve visited Cambridge once (Physics department), Edinburgh once (Speech technology and Linguistics), University College London twice (Nanotechnology and Linguistics), Birmingham University once (Computer Science), and that&#8217;s about it.  Hardly a uniform sample, and hardly a big one.  I&#8217;ve met a lot of professors from other UK universities at conferences and a fair number of students who have come to Oxford, but those are almost all speech scientists or linguists.  My only basis for an overall evaluation of universities (rather than their language and speech departments) is what I read on the web or in newspapers.</p>
<p>But you also pick these things up by talking to people.  Everyone else has opinions and expresses them with that shrug of the shoulders or that tone of voice.  But, do they have a better basis for their opinions than I do?  I doubt it.</p>
<p>So, where did we apply, and why?  <span style="color: #008000;">(But don&#8217;t take this bit seriously.)</span></p>
<ul>
<li>Cambridge, probably so she wouldn&#8217;t have to live at home.  <span style="color: #008000;">(NB: You can&#8217;t apply to both Cambridge and Oxford.)</span></li>
<li>Durham, because it&#8217;s a good place (at least based on one grad student) and it&#8217;s not in London.</li>
<li>MIT, probably because MIT students put a police car up on the Great Dome, years ago.  Also, a &#8220;weird, energetic&#8221; culture and a couple of good lectures during a campus visit.  And, both of her parents went there, years ago.</li>
<li>Stanford.  Palm trees, blue skies, an atmosphere of academics mixed with luxury, and a really enthusiastic tour guide.  Good lectures.</li>
<li>Johns Hopkins, despite the fact that a student there told us &#8220;I like it here, but if you can get into X, I&#8217;d go there instead.&#8221;  The <a title="April Fool's Prank" href="http://www.washingtonpost.com/wp-dyn/content/article/2010/04/01/AR2010040102179.html" target="_blank">April fool&#8217;s prank</a> certainly helped: it showed the administration had a sense of humor.   Good stuff on display in the physics lobby, but a negative was they wouldn&#8217;t let us go to a lecture.</li>
<li>Not Harvard, despite its high rank.  I think it was because the tour guides and campus visit were remarkably uninformative.  Too careful and cautious.  We visited some classes: an excellent history lecture and a mediocre statistics lecture.</li>
<li>Princeton.  Perhaps because of Hoagie Haven.  They would have let us into a lecture, but we didn&#8217;t realize it at first.</li>
<li>Duke.  Good place, near the family, good visit.</li>
<li>&#8230;and a few other places.</li>
</ul>
<p>My daughter requires me to say that there were other, more rational reasons in addition to those above.  For instance, Durham (like Cambridge) offers Natural Sciences instead of Physics, which is attractive because she doesn&#8217;t want to focus down too tightly, too soon.  But some of the decision really does come down to the visit: the tour guides, the lectures, the stuff on the walls, and the way the students walk.</p>
]]></content:encoded>
			<wfw:commentRss>http://kochanski.org/blog/?feed=rss2&#038;p=603</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Infinite elephants and other mathematical doodles.</title>
		<link>http://kochanski.org/blog/?p=665</link>
		<comments>http://kochanski.org/blog/?p=665#comments</comments>
		<pubDate>Thu, 21 Apr 2011 21:50:53 +0000</pubDate>
		<dc:creator>gpk</dc:creator>
				<category><![CDATA[math]]></category>
		<category><![CDATA[music]]></category>
		<category><![CDATA[teaching]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[wild ideas]]></category>
		<category><![CDATA[communication]]></category>
		<category><![CDATA[doodle]]></category>
		<category><![CDATA[impact]]></category>
		<category><![CDATA[nifty]]></category>

		<guid isPermaLink="false">http://kochanski.org/blog/?p=665</guid>
		<description><![CDATA[Vi Hart has all kinds of neat videos, including how to draw infinite elephants.   They&#8217;re great fun to watch, and they&#8217;re real math.]]></description>
			<content:encoded><![CDATA[<p><a href="http://vihart.com">Vi Hart</a> has all kinds of neat videos, including how to draw <a href="http://vihart.com/doodling/infinity-elephants.mp4">infinite elephants</a>.   They&#8217;re great fun to watch, and they&#8217;re real math.</p>
]]></content:encoded>
			<wfw:commentRss>http://kochanski.org/blog/?feed=rss2&#038;p=665</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://vihart.com/doodling/infinity-elephants.mp4" length="22404115" type="video/mp4" />
		</item>
		<item>
		<title>Searching the Web with Google: Some Oddities and the Problem of Inflation</title>
		<link>http://kochanski.org/blog/?p=659</link>
		<comments>http://kochanski.org/blog/?p=659#comments</comments>
		<pubDate>Fri, 15 Apr 2011 09:05:35 +0000</pubDate>
		<dc:creator>Juzek</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[complex algorithms]]></category>
		<category><![CDATA[frequency counts]]></category>
		<category><![CDATA[Language]]></category>
		<category><![CDATA[language usa]]></category>
		<category><![CDATA[opaque code]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[systematic errors]]></category>
		<category><![CDATA[testing]]></category>
		<category><![CDATA[who-watches-the-watchers]]></category>
		<category><![CDATA[words]]></category>

		<guid isPermaLink="false">http://kochanski.org/blog/?p=659</guid>
		<description><![CDATA[When working on one of my papers (for more, click here [1]), I had the opportunity to learn more about the nature of Google’s search engine; and in particular about the number of hits that it returns. The number of hits Google gives is just an estimate, a fact that Google themselves emphasise. For instance, [...]]]></description>
			<content:encoded><![CDATA[<p>When working on one of my papers (for more, click here [<a title="On the Concept of Grammaticality" href="http://users.ox.ac.uk/~wolf2801/projects/02.html" target="_blank">1</a>]), I had the opportunity to learn more about the nature of Google’s search engine;  and in particular about the number of hits that it returns.</p>
<p>The number of hits Google gives is just an estimate, a fact that Google themselves emphasise. For instance, you might have noticed that Google only gives three significant digits in their results (see Matt Cutts&#8217; comment here [<a title="Matt Cutts' Comment" href="http://www.businessinsider.com/deliberate-query-sabotage-or-merely-the-weimaraner-effect-2010-10#comment-4cc0a6a549e2aec36b0e0000" target="_blank">2</a>]). There is more, as Greg <a href="http://kochanski.org/blog/?p=606">pointed out</a>: see an entry by Prof Jean Véronis here [<a title="Veronis' Blog" href="http://blog.veronis.fr/2005/01/web-googles-counts-faked.html" target="_blank">3</a>] and another by Mark Liberman here [<a title="Liberman's Blog" href="http://itre.cis.upenn.edu/~myl/languagelog/archives/001837.html" target="_blank">4</a>].  These distortions are rather sophisticated and might not affect your search queries.</p>
<p style="padding-left: 30px"><span style="color: #008000">Incidentally, you might wonder why I’ve been using Google’s results for an academic paper. Primarily because there is simply no practical alternative. Second, I&#8217;m taking some precautions to help ensure that (to the extent that the counts are distorted) all my results are affected equally. Since I&#8217;m looking at the ratios between certain search  queries, any distortion that is common to both queries will not affect the results.</span></p>
<p style="padding-left: 30px"><span style="color: #008000">Third, I made sure that I saw the number of hits deflate. Finally, I assume that all my search queries were equally distorted, which again makes the results more viable.<br />
</span></p>
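<p style="padding-left: 30px"><span style="color: #008000">The reasoning about ratios can be illustrated with a small Python sketch. It assumes, purely for illustration, that a reported count is the true count multiplied by an unknown factor; the function name and all of the numbers are hypothetical and are not taken from the paper.</span></p>

```python
def observed_count(true_count, distortion):
    """Model a reported hit count as the true count times an unknown factor."""
    return true_count * distortion


# Hypothetical true counts for two queries (illustrative numbers only).
true_a, true_b = 300, 100

# A distortion shared by both queries cancels when the ratio is taken.
ratios = [
    observed_count(true_a, d) / observed_count(true_b, d)
    for d in (1.0, 4.0, 200.0)
]
print("ratios under a common distortion:", ratios)

# If the two queries are distorted by different factors, nothing cancels.
skewed = observed_count(true_a, 4.0) / observed_count(true_b, 10.0)
print("ratio under unequal distortion:", skewed)
```

<p style="padding-left: 30px"><span style="color: #008000">The last line is the caveat: the ratio is only trustworthy to the extent that both queries really are distorted by the same factor.</span></p>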
<p>But, despite the usefulness of the Google counts, there are limitations that might cause trouble. One oddity occurs in big searches: One can only access a tiny fraction of the results Google returns (e.g., when googling “Volkswagen”, one will get some 800-900 million results, but one will only be able to click through a couple of hundred of them).</p>
<p>Another problem is this: well into my research, I noticed that the results seem to be inflated. Certain search queries will return thousands of hits at first, but as you browse through them, the number collapses. Try the following search parameters and see for yourself:</p>
<p style="padding-left: 30px"><span style="color: #0000ff">“{menace OR menaces OR menaced} {me OR you OR him OR her OR us OR them} to” -secrecy -silence</span></p>
<p>(Note the double inverted commas and the negatives.) At first, Google gives some 40 000 hits for this query, but as you page through, the number collapses to nothing, really.</p>
<p>This was a serious blow to my research, as I depended on the reliability of Google’s numbers. I got in touch with Alex Chitu of the blog Google Operating System [<a title="GOS" href="http://googlesystem.blogspot.com/" target="_blank">5</a>] and he explained to me that the reason for this distortion lies in the syntactic complexity of the search queries: they use quotation marks, logical operators, and exclusions, which makes Google’s estimates coarser.</p>
<p>This might explain the extreme inflation (by a multiplier of about 200), but even simple search queries give results that are inflated to a certain degree. Try to google ‘Eierschalensollbruchstelle’ (from Eierschalensollbruchstellenverursacher, German for egg puncher; a small  kitchen utensil to punch a hole into an egg, so it doesn’t crack when  boiling) and you’ll get some 500 or 600 results. As you click through them, it deflates to well under 200 results. Or try the proper name ‘  “Adam Roseneck” ’ (it’s important to put it into inverted commas): it gives 500 &#8211; 600 hits, which deflate to well under 50.</p>
<p>Now, the point with ‘Eierschalensollbruchstelle’ and ‘Adam Roseneck’ is that they are not incredibly complex queries in terms of their syntactic structure, but there is still some significant inflation going on, viz. by multipliers of about 4 and 10, respectively.</p>
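<p>A lower bound on these multipliers can be worked out from the bounds quoted above. The following Python sketch is only illustrative: the function name is made up, and it uses the lowest initial estimate (500) together with the stated upper limits after paging (200 and 50), so the true factors are larger than what it prints.</p>

```python
def min_inflation(initial_low, final_high):
    """Smallest inflation factor consistent with the reported bounds."""
    return initial_low / final_high


# Bounds quoted in the text: (lowest initial estimate, upper bound after paging).
queries = {
    "Eierschalensollbruchstelle": (500, 200),
    '"Adam Roseneck"': (500, 50),
}

factors = {name: min_inflation(lo, hi) for name, (lo, hi) in queries.items()}
for name, factor in factors.items():
    print(f"{name}: inflated by at least a factor of {factor:.1f}")
```

<p>Because the counts deflate to “well under” those limits, multipliers of about 4 and 10 are consistent with these lower bounds.</p>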
<p>Further, when comparing my results from 2008 to results you’d get today, I noticed that current Google results are completely off (maybe this points to a change in Google’s search algorithm?). Yahoo’s 2011 results, however, remained nearly the same compared to 2008 (apart from a slight overall increase, which you’d expect from a rapidly growing internet).</p>
<p>All these issues call Google’s estimate into doubt. As Google say, their “results estimates are just that &#8212; estimates” (again here [<a title="Matt Cutts Again" href="http://www.businessinsider.com/deliberate-query-sabotage-or-merely-the-weimaraner-effect-2010-10#comment-4cc0a6a549e2aec36b0e0000" target="_blank">2</a>]).  It is for you to judge how reliable these estimates are.</p>
]]></content:encoded>
			<wfw:commentRss>http://kochanski.org/blog/?feed=rss2&#038;p=659</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

