Wednesday, February 17, 2016

What Scalia Was Truly Like

If you want to get a feel for what the late Supreme Court justice Antonin Scalia was like, you can do no better than to read this long interview from three years ago.

Some highlights: despite being so "brilliant", Scalia was unsure about the pronunciation of the word "ukase" and wasn't familiar with the term "tell" as applied to poker. I am neither a lawyer nor a poker player, but I knew both of these. And I'm not particularly bright.

Scalia also knew nothing about linguistics, if he thought "Words have meaning. And their meaning doesn’t change." That's an extremely naive view of language and meaning. In reality, the meaning of words is fuzzy and smooshed out. And meaning changes all the time. Compare our current understanding of "nubile" with its definition in a dictionary from 50 years ago.

Scalia read the Wall Street Journal and the Moonie-controlled Washington Times, but stopped reading the Washington Post because it was "slanted and often nasty". He didn't read the New York Times at all. Talk about being unaware of your own biases!

Scalia believed that the "Devil" is a real person because it is Catholic dogma (and by implication, because one cannot be a Catholic without accepting all of Catholic dogma). That's exactly the kind of black-and-white extremist viewpoint it takes to be an originalist. He thought this being was occupied in getting people not to believe in the Christian god. And he liked The Screwtape Letters, easily the stupidest of C. S. Lewis's output (and that's saying something).

Scalia justified his belief by saying "Many more intelligent people than you or me have believed in the Devil." Yeah, well, many more intelligent people than I believe in Scientology, Bigfoot, and alien abductions, but that isn't a good argument for them.

He also said that the Devil's becoming cleverer was "the explanation for why there’s not demonic possession all over the place. That always puzzled me. What happened to the Devil, you know? He used to be all over the place." The other explanation -- that there is no Devil and demonic possession never happened (it was health conditions misinterpreted by an ignorant and superstitious populace) -- was too obviously correct for him to consider.

Scalia thought that the only two possible choices after his death were "I'll either be sublimely happy or terribly unhappy." The obvious correct choice -- namely that he would simply cease to be -- did not even enter his mind as a possibility.

Scalia thought he was "heroic" by not recusing himself in a case where he clearly should have recused himself.

Reading this interview I could only think: What an asshole! Good riddance.

Monday, February 15, 2016

My Scalia Experience

Now that Supreme Court justice Antonin Scalia has died, one can find tributes to him everywhere, even from some liberals. He is being lauded for his intelligence and for being a nice guy in person.

Well, my Scalia experience is different. First, he may have been extremely intelligent, but even intelligent people can have blind spots. For Scalia, one obvious blind spot was the theory of evolution. Not only did he not understand the status of the theory among scientists, as Stephen Jay Gould famously pointed out, but he also recently used the figure "5000 years" as an estimate for the age of humanity, when the actual figure is more like 100,000 to 200,000 years.

And as for being a nice guy, I can only tell about my own experience. Sometime in the late 1980s (I think it was 1987) he came to give a speech at the University of Chicago when I was teaching there. At the end of the talk there was time for questions. I asked a question -- I don't really remember what it was about -- and Scalia got all huffy. He said something like, "I don't think that's appropriate for me to answer. In fact, it was completely inappropriate for you to ask."

Well, it wasn't. It was definitely appropriate, and it was about constitutional law, even if I don't quite remember what I asked. What I remember is the contempt he expressed, in his words and body language, that anyone would dare ask.

So maybe it's true, as some have said, that he was a wonderful guy with a great sense of humor and enormous intelligence. All I can say as an outsider is, not in my experience.

Yet Another Dubious Journal

From a recent e-mail message I received:

Dear Dr. Jeffrey Shallit,

Greetings from Graphy Publications

We kindly invite you to join the editorial board for International Journal of Computer & Software Engineering

The journal aims to provide the most complete and reliable source of information on current developments in the field of computer & software engineering. The emphasis will be on publishing quality articles rapidly and making them freely available to researchers worldwide. The journal will be essential reading for scientists and researchers who wish to keep abreast of the latest developments in the field.

International Journal of computer & software engineering is an international open access journal using online automated Editorial Managing System of Graphy Publications for quality review process. For more details please go through below link.

Hope you accept our invitation and you are requested to send us your recent passport size photo (to be displayed on the Journal’s website), C.V, short biography (150 words) and key words of your research interests for our records.

We are keenly looking forward to receiving your positive response

Yours sincerely,

J. Hemant
Managing Editor
International Journal of Computer & Software Engineering
Graphy Publications

Any journal of "computer & software engineering" that invites me to be on the editorial board, when I work in neither computer engineering nor software engineering, is clearly not to be taken seriously. Other bad signs: random capitalization in the invitation letter, failure to end sentences with proper punctuation, and an editorial board filled with people I've never heard of. Not surprisingly, the publisher, "Graphy Publications", is on Beall's List of Potential, possible, or probable predatory scholarly open-access publishers.

Thursday, February 11, 2016

Reproducibility in Computer Science

There has been a lot of discussion lately about reproducibility in the sciences, especially the social sciences. The result that garnered the most attention was the Nosek study, where the authors tried to reproduce the results of 98 studies published in psychology journals. They found that they were able to reproduce only about 40% of the published results.

Now it's computer science's turn to go under the spotlight. I think this is good, for a number of reasons:

  1. In computer science there is a lot of emphasis placed on annual conferences, as opposed to refereed journal articles. Yes, these conferences are usually refereed, but the reports are generally done rather quickly and there is little time for revision. This emphasis has the unfortunate consequence that computer science papers are often written quite hastily, a week or less before the deadline, in order to make it into the "important" conferences of your area.

  2. These conferences are typically quite selective and accept only 10% to 30% of all submissions. So there is pressure to hype your results and sometimes to claim a little more than you actually got done. (You can rationalize it by saying you'll get it done by the time the conference presentation rolls around.)

    (In contrast, the big conferences in mathematics are often "take-anything" affairs. At the American Mathematical Society meetings, pretty much anyone can present a paper; they sometimes have a special session for the papers that are whispered to be junk or crackpot stuff. Little prestige is associated with conferences in mathematics; the main thing is to publish in journals, which have a longer time frame suitable for good preparation and reflection.)

  3. A lot of research in computer science, especially the "systems" area, seems pretty junky to me. It always amazes me that in some cases you can get a Ph.D. just for writing some code, or, even worse, just modifying a previous graduate student's code.

  4. Computer science is one of the areas where reproducibility should (in theory) be the easiest. Usually, no complicated lab setup or multimillion-dollar equipment is needed. You don't need to recruit test subjects or pass through ethics reviews. All you have to do is compile something and run it!

  5. A lot of computer science research is done using public funds, and as a prerequisite for obtaining those funds, researchers agree to share their code and data with others. That kind of sharing should be routine in all the sciences.

Now my old friend and colleague Christian Collberg (who has one of the coolest web pages I've ever seen) has taken up the cudgel of reproducibility in computer science. In a paper to appear in the March 2016 issue of Communications of the ACM, Collberg and co-authors Todd Proebsting and Alex M. Warren relate their experiences in (1) trying to obtain the code described in papers and then (2) trying to compile and run it. They did not attempt to reproduce the results in papers, just the very basics of compiling and running. They did this for 402 (!) papers from recent issues of major conferences and journals.

The results are pretty sad. Many authors had e-mail addresses that no longer worked (probably because they moved on to other institutions or left academia). Many simply did not reply to the request for code (in some cases Collberg filed freedom of information requests to try to get it). Of those who did reply, the code often failed for a number of different reasons, such as important files being missing. Ultimately, only about half of all papers had code that passed the very basic tests of compiling and running.

This is going to be a blockbuster result when it comes out next month. For a preview, you can look at a technical report describing their results. And don't forget to look at the appendices, where Collberg describes his ultimately unsuccessful attempt to get code for a system that interested him.

Now it's true that there are many reasons (which Collberg et al. detail) why this state of affairs exists. Many software papers are written by teams, including graduate students who come and go. Sometimes the code is not adequately archived, and disk crashes can result in losses. Sometimes the current system has been greatly modified from what's in the paper, and nobody saved the old one. Sometimes systems ran under older operating systems but not the new ones. Sometimes code is "fragile" and not suitable for distribution without a great deal of extra work, which the authors don't want to do.

So in their recommendations Collberg et al. don't demand that every such paper provide working code when it is submitted. Instead, they suggest a much more modest goal: that at the time of submission to conferences and journals, authors mention what the state of their code is. More precisely, they advocate that "every article be required to specify the level of reproducibility a reader or reviewer should expect". This information can include a permanent e-mail contact (probably of the senior researcher), a website from which the code can be downloaded (if that is envisioned), the degree to which the code is proprietary, availability of benchmarks, and so forth.

Collberg tells me that as a result of his paper, he is now "the most hated man in computer science". That is not the way it should be. His suggestions are well-thought-out and reasonable. They should be adopted right away.

P. S. Ironically, some folks at Brown are now attempting to reproduce Collberg's study. There are many who take issue with specific evaluations in the paper. I hope this doesn't detract from Collberg's recommendations.

Tuesday, February 09, 2016

More Silly Philosopher Tricks

Here's a review of four books about science in the New York Times. You already know the review is going to be shallow and uninformed because it is written not by a scientist or even a science writer, but by James Ryerson. Ryerson is more interested in philosophy and law than science; he has an undergraduate degree from Amherst, and apparently no advanced scientific training.

In the review he discusses a new book by James W. Jones entitled Can Science Explain Religion? and says,

"If presented with this argument, Jones imagines, we would surely make several objections: that the origin of a belief entails nothing about its truth or falsity (if you learn that the earth is round from your drunk uncle, that doesn’t mean it’s not)..."

Now I can't tell if this is Jones or Ryerson speaking, but either way it illustrates the difference between the way philosophers think and the way everyone else thinks. For normal people who live in a physical world, where conclusions are nearly always based on partial information, the origin of a belief does and should impact your evaluation of its truth.

For example, I am being perfectly reasonable when I have a priori doubts about anything that Ted Cruz says, because of his established record for lying: only 20% of his statements were evaluated as "true" or "mostly true". Is it logically possible that Cruz could tell the truth? Sure. It's also logically possible that monkeys could fly out of James Ryerson's ass, but I wouldn't be required to believe it if he said they did.

For non-philosophers, when we evaluate statements, things like the speaker's reputation for veracity are important, as are evidence, the Dunning-Kruger effect, the funding of the person making the statement, and so forth. Logic alone does not rule in an uncertain world; in the real world these things matter. So when a religion professor and Episcopal priest like Jones writes a book about science, I am not particularly optimistic that he will have anything interesting to say. And I can be pretty confident I know his biases ahead of time. The same goes for staff editors of the New York Times without scientific training.

Friday, February 05, 2016

3.37 Degrees of Separation

This is pretty interesting: Facebook has a tool that estimates the average number of intermediate people needed to link you, via the shortest path, to anyone else on Facebook. Mine is 3.37, which means the average path length (number of links) to me is 4.37, or that the average number of people in a shortest chain connecting others with me (including me and the person at the end) is 5.37.

What's yours?

An interesting aspect of this is that they use the Flajolet-Martin algorithm to estimate the path lengths. The paper of Flajolet and Martin deals with a certain correction factor φ, which is defined as follows: φ = 2^(-1/2) e^γ α^(-1), where γ = 0.57721... is Euler's constant and α is the constant Π_{n ≥ 1} (2n/(2n+1))^((-1)^(t(n))), where t(n) is the Thue-Morse sequence, whose nth term is the parity of the number of 1's in the binary expansion of n.
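To make the algorithm concrete, here is a minimal single-sketch illustration of the Flajolet-Martin idea in Python. This is my own sketch, not Facebook's implementation: the choice of SHA-256 as the hash and the 64-bit truncation are assumptions for illustration.

```python
import hashlib

PHI = 0.77351  # the Flajolet-Martin correction factor discussed above

def rho(x: int) -> int:
    """Index (0-based) of the least significant 1-bit of x."""
    return (x & -x).bit_length() - 1 if x else 64

def fm_estimate(items) -> float:
    """Estimate the number of distinct items with a single FM sketch."""
    bitmap = 0
    for item in items:
        # hash each item to a 64-bit integer (hash choice is arbitrary here)
        h = int.from_bytes(hashlib.sha256(str(item).encode()).digest()[:8], "big")
        bitmap |= 1 << rho(h)
    # r = index of the least significant 0-bit of the bitmap
    r = 0
    while bitmap & (1 << r):
        r += 1
    return 2 ** r / PHI

est = fm_estimate(range(10000))  # rough estimate of 10000 distinct items
```

A single sketch like this is very noisy (the estimate is a power of 2 divided by φ); real systems average many independent sketches to get usable accuracy.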

The Thue-Morse sequence has long been a favorite of mine, and Allouche and I wrote a survey paper about it some time ago, where we mentioned the Flajolet-Martin formula. The Thue-Morse sequence comes up in many different areas of mathematics and computer science. And we also wrote a paper about a constant very similar to α: it is Π_{n ≥ 0} ((2n+1)/(2n+2))^((-1)^(t(n))). Believe it or not, it is possible to evaluate this constant in closed form: it is equal to 2^(-1/2) = √2/2!
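A quick numeric check of this (my own few lines of Python, just to see the convergence) computes t(n) and a partial product of the constant:

```python
def t(n: int) -> int:
    """Thue-Morse sequence: parity of the number of 1's in the binary expansion of n."""
    return bin(n).count("1") % 2

def partial_product(terms: int) -> float:
    """Partial product of prod_{n >= 0} ((2n+1)/(2n+2))^((-1)^t(n))."""
    p = 1.0
    for n in range(terms):
        p *= ((2 * n + 1) / (2 * n + 2)) ** ((-1) ** t(n))
    return p

approx = partial_product(1 << 16)
# approaches 2**(-0.5) = 0.70710678... as the number of terms grows
```

Since consecutive pairs of Thue-Morse terms have opposite signs, the partial products converge quite quickly; 2^16 terms already agree with √2/2 to several decimal places.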

By contrast, nobody knows a similar simple evaluation for α. In fact, I have offered $50 for a proof that α is irrational or transcendental.