Showing posts with label computers. Show all posts

Friday, March 02, 2012

Google goes stupid

Yesterday, at work, I was using Google Image Search to try to find an example of a particular kind of scientific figure. I wanted a figure where any variable has been measured over ages from early in life to late in life, and where the data are U-shaped (start out high, go down, then go back up) but where someone had fit a simple linear regression to these data, completely missing the pattern. I see such figures fairly often, but couldn’t remember exactly where. I also wanted examples where they had got it right and fit something other than a straight line.

So I used search terms like: “linear regression” “scatter plot” age and figure. And I got back images of babies, mothers breastfeeding, pornographic images of women (not medical or educational or artistic, despite having Safe Search set at strict) mixed with a smattering of scatter plots, almost all from papers about women’s reproductive health, breast and cervical cancer, and baby nutrition. None of it was useful to me, and a lot had no apparent link to any of my search terms. What was going on?

At the same time at home, my wife was reading a Blogger-hosted blog on women’s reproductive health and baby-care. She was using my laptop and was still signed in to Gmail as me. Blogger is owned by Google. Yesterday was the first day of Google’s new policy of sharing information about users of any of its services with any of its other services to help personalize search results and the ads it shows. I was looking for figures. My wife was reading about women, their reproductive health, and their babies. I got images of babies, women’s figures and reproductive parts, and scientific figures plotting data on same. If I were interested in scatter plots related to breast cancer, I would include “breast-cancer” in my search terms. Yesterday, I logged out of my Google account in order to get decent results from Google. This is not helpful for Google or for me.

There has been a big kerfuffle about the lack-of-privacy implications of this new policy. People say that Google is becoming Facebook, a sinister ploy to know more about you than you do. I am sympathetic to those complaints, but not particularly worried about them for my own sake. My main complaint isn’t that the new policy is evil (i.e., Facebookish), but that it is stupid (i.e., Microsoftian), in that it makes Google’s core product, Search, less useful. Google is forgetting what Microsoft has never figured out: Too many functions doing things for you detract tremendously from the functionality of the software. If I want to sit down at a new computer and use Word, I don’t want to first have to individually turn off all twelve parts of Autocorrect and 75 other things that will change my document for me in undesirable ways. And if I want to search for something, I don’t want terms I haven’t inserted invisibly added to my search. To have the algorithm decide to include in my search terms from a blog that I (or anyone logged into the same account) read takes control away from me, and then I have to fight against the algorithm to find what I want (or, heaven forfend, use Bing). It’s like having a car that tries to drive you to a restaurant of its choice every time anyone in the car, or on the radio, mentions food. You drive because you have to, but after the tenth visit to that terrible restaurant, you wish you could just walk.

The Google-becoming-Facebook complaints are surely in part a reference to Google’s relatively new social network site, Google+. But here again, it feels to me more like Google-becoming-Microsoft. My complaint stems from their push to drive every possible bit of traffic to Google+, even if this makes their other products harder to use. Microsoft similarly tried (before being ordered otherwise by the courts) to so thoroughly integrate Internet Explorer into Windows that you basically couldn’t use one without the other. Google seems to be trying to do the same with all their products and Google+. The ‘Photos’ link used to take me to my photos on Picasa. This was helpful. Now it takes me to the recent photos that people in my circles have uploaded to Google+. This is not generally useful, and makes it harder to get to my pictures. On pretty much all their products the Share button (which used to take me directly to the option to email or link to an item) now opens a window asking me to post whatever it is on Google+, making it harder to share whatever it is with the vast majority of people I communicate with through other means. I understand that Google wants me to communicate with all of them only through Google+, but as that is never going to happen, designing a product that assumes it has already happened is stupid. There are quite a few of these small things that don’t make a big difference in and of themselves, but take Google’s products, which have traditionally been beautifully designed and implemented, one step closer to things I use because I have to. And once you are into “use because I have to” territory, you are Microsoftian.

Tuesday, September 06, 2011

Dan SMASH!

I've just spent six hours trying to convert a file (a poster I'm presenting at a conference next week) from one standard format to another for printing. After six computers and dozens of programs each of which is supposed to be able to do this conversion instantly, it finally converted, although it looks kind of crappy in its converted format. I've sent it to the printer because I don't care anymore. None of the other things I desperately needed to get done today got done.

I am not generally given to violence, but do currently have the urge to break something.

Thursday, October 08, 2009

Language Learning

Today Iris and I are each trying to learn a language. Iris is out, walking around Rostock, investigating the various schools in town that offer German classes to auslanders. I am focusing on learning a much more broadly used language, R. Now German is certainly used by more people than is R, and R isn't really anybody's first language, but R is used all around the world, and by a surprising range of people. Yesterday a colleague who has been in Germany for a year and not yet learned German said to me, "I'm not staying in Germany forever, and German isn't going to do me a whole lot of good outside of a few countries, but R I will need for every job I might ever have."

R is a simple programming language intended for statistics and data analysis. It is rapidly becoming the standard for advanced data analysis, in the natural and social sciences, from advanced college students to statistics professors, and in every country where people with internet connections need to analyze data.

Back in the 1970s, Bell Labs developed a programming language called S (for "statistical"), and in the mid '90s two statisticians at the University of Auckland, Ross Ihaka and Robert Gentleman, had the wisdom to release an open source implementation of it, called R. R has the wonderful property of being easy to extend. Any user can invent new words for this language and tell the computer exactly what to do when users use those words. This is equivalent to English's allowance of the sentence, "From now on, let's use the word reflop to mean 'to flip something over, and then flip it back to its original position.'" Users can also find something they don't think works well, look at the underlying language, and tell the computer, "from now on, I want this word to mean X, not Y as it did before." Users have added and modified Graphical User Interfaces, made implementations that work inside other programs, and compiled packages for every major operating system.

These extensions and modifications can be uploaded to the R website, and other users can decide which bits and pieces they want. Every once in a while a pre-fab version is released, with all the most recommended bits and pieces, and with someone having checked that they all work well together. So every user is necessarily a programmer, and every programmer can fairly straightforwardly improve on the model. It is as though every user of an open source browser such as Firefox, in learning how to use the browser, also necessarily learned how to make improvements to the browser. By this model R quickly and clearly outstripped S and S+. I am sure there is someone out there who still uses S, but not many. R is more versatile, more widely used, has elegant add-ons in fields from architecture to phylogenetics, and is entirely free. It's the feel-good statistics package of the decade, and a serious threat to the business model of anyone who makes money selling data analysis software (which can often cost hundreds of dollars for a single user).

At the Max Planck Institute for Demographic Research, where I've recently started working, everybody uses R. The simulations are in R, the data queries are in R, the statistics are in R, and the figures and graphs are created in R. R is more necessary than German for anyone at the Institute. Which is why I am dedicating the next couple of weeks to learning it. As with any language, the largest part of learning R is trying to use it, failing to be understood, and trying again. So I've given myself a task, outlining in English a simple simulation I need to perform for a paper I'm revising. Programming this requires about 50 steps. So far I've figured out the first three, and I'm stumped on the fourth. Even so, I think my R is already better than my German.