14
StupidFilter
I’ve known about this project for some time, but I had to bring it up after I watched a few YouTube videos after work. I accidentally read some of the comments below the videos, and my eyes started burning. It made me remember StupidFilter.
StupidFilter is essentially a spam or junk filter, but for stupid text. If you play around with the demo, you can see how it works. If you like pain, you can check the random page to see all of the random comments StupidFilter has stored in its database thus far.
I can’t help but think how useful StupidFilter would be to Google and YouTube. While Google must use a large amount of storage for inane response videos, the text comments that get stored must use a decent amount of storage at YouTube’s scale too. It would save Google money and it would make YouTube comments easier on the eyes.
7
Good-bye, Google (Part II)
In Part I of this post, I gave my reasons for abandoning Google Search (as a step to abandoning Google as a whole). In reality, those reasons were only the spark that got me looking at alternatives, and if the alternatives hadn’t been so good, I would have probably returned, but the fact is, there are some other good search engines out there.
7
Good-bye, Google (Part I)
Last week Google announced they would be personalizing everyone’s searches regardless of whether they were logged in or not. I’m not a privacy nut, but this just seems wrong to me. I know, you can opt out of the service, but how many people even know their searches are being personalized and are internet savvy enough to do it? (Almost anyone who reads this blog probably is, but that’s not to say you’re the average internet user.)
Google news of late has gotten me thinking about search engines and what they’re supposed to do. As I mulled it over, I realized I’m not really interested in which web pages tell me what I want to know, I’m interested in what I want to know. Google only gives specific answers if you ask for something Google Calculator knows how to deal with. (Additionally, I often have the problem that a lot of results Google gives me are irrelevant to my search–many times I cannot even find my search terms on the page using the browser’s Find functionality.)
At the same time, I am starting to be concerned about putting all my eggs in one basket: Google Search, Gmail, Google Reader, Google Maps, the list goes on. Earlier this year, Google broke the internet (which could happen to any search engine) and accidentally shared Google Documents with contacts users hadn’t given permission to access (which could have happened to any online text editing service). Incidents like this are subtle reminders to me that it may not be a good idea to put too much trust in a powerful company.
A testimonial about interviewing for Google changed my perspective on the Mountain View company. The number of questions the interviewers asked that focused solely on advertising revenues (2 out of 4) made me associate Google less with internet services and more with advertising companies like Lamar. It struck me that they’re less interested in providing great products and more interested in providing products that increase their ad revenues, which, while perfectly fine, makes them seem lame.
It hasn’t taken long for me to start relating to articles like this one, which question whether Google is friend or foe, and articles like this one and this one sparked my interest in Bing, which I’d played with when it premiered but didn’t find much different from Google. As the second article points out, Bing is starting to provide answers where it can rather than mere links, a tactic Google might avoid, since its business strategy is so wound around advertising revenues. It might not be a good business decision to start giving the answers rather than driving traffic to sites with the answers. While Bing makes its money from advertising, it’s backed by Microsoft, which has a pretty good source of revenue through other means; Google doesn’t.
Those are the high points of my case against Google, the points that have convinced me to start looking at alternatives. I’ve abandoned Google Search (which I’ll discuss in Part II of this post), and my experience with Google Wave lasted all of a week: it sounds cool, and it looks cool, and I want it to be cool, but it’s just not. I’m sticking with Gmail and Google Reader for the moment, but when I start a new job in a couple of weeks and no longer have wireless during the day, being able to access my RSS feeds on my iPod Touch will be much less important, and I’ll probably switch to a desktop aggregator. In the meantime, Gmail is awesome, but I do hope a Google Glitch doesn’t start showing my emails in Google Search results, and if I find a good alternative, I’ll have to seriously consider making the change to it (if you have any suggestions, let me know in the comments).
For a rundown of alternatives to Google Search, jump to Part II of this post.
7
How many users on Twitter?
Twitter has been pretty strict about making sure the public is not aware of how many users they have, and this is probably for good reason. I really think the amount of hype that Twitter has had is not justified. So the question I set out to answer is quite simple: how many Twitter accounts are there?
From that point there is another question that must be asked: who, other than Twitter, would know the information I’m looking for?
First I googled around a little and found this article on TechCrunch, which had some excellent information on the subject, including how user ID numbers on Twitter do not reflect the number of users that are on Twitter.
5
Bing.com blunder: Page 21 goes blank (Firefox)
Bing.com blanks out at page 21-22 of search results in Firefox.
Take a simple search in bing.com:
http://www.bing.com/search?q=ford
This will give you a page saying there are 160,000,000 results and you’re being shown 1-20 of them. That’s all fine and dandy, but let’s say I want to just flip through the pages a little and see what’s new with Ford.
When you click on the pagination at the bottom, the URLs you’re going to see look something like this:
Page 2: http://www.bing.com/search?q=ford&first=6&FORM=PERE
Page 3: http://www.bing.com/search?q=ford&first=16&FORM=PERE1
Page 4: http://www.bing.com/search?q=ford&first=26&FORM=PERE2
Page 5: http://www.bing.com/search?q=ford&first=36&FORM=PERE3
You’ll notice a couple of things changing in these URLs. The first is the “first” GET variable, which appears to be some kind of offset into the results.
The next is that FORM=PERE business. I’m not sure exactly what it does, and it didn’t seem to matter if I removed it from the URL anyway.
The key thing I found is that if you click through the pages until you get to around page 21 or 22, the screen goes completely blank. The magic change is the “first” variable reaching a value greater than or equal to 200. As far as I can tell, this is only an issue in Firefox.
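Going by the four URLs listed above, the “first” value looks like it follows 10 * (page - 1) - 4 from page 2 onward. That formula is only my guess from those samples, not anything Bing documents, and the function names here are made up for illustration. A minimal Python sketch under that assumption:

```python
from urllib.parse import urlencode


def first_offset(page: int) -> int:
    """Guess the `first` value for a results page (page >= 2).

    Assumption: the pattern seen in the sample URLs (6, 16, 26, 36)
    continues for later pages.
    """
    if page < 2:
        raise ValueError("the sample URLs only cover page 2 and later")
    return 10 * (page - 1) - 4


def page_url(query: str, page: int) -> str:
    """Build a results URL without the FORM parameter, which didn't seem to matter."""
    params = urlencode({"q": query, "first": first_offset(page)})
    return "http://www.bing.com/search?" + params


# Find the first page whose guessed offset reaches the 200 threshold.
first_blank_page = next(p for p in range(2, 100) if first_offset(p) >= 200)
```

If the pattern holds, page 21 sits just under the threshold at 196 and page 22 is the first one over it at 206, which lines up with where the screen went blank for me.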
I checked the headers for both cases and there don’t seem to be any significant differences. (The URLs are quoted so the shell doesn’t treat each & as a command separator.)
$ curl -I 'http://www.bing.com/search?q=ford&first=16&FORM=PERE4'
$ curl -I 'http://www.bing.com/search?q=ford&first=206&FORM=PERE4'
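The same comparison can be sketched in Python so the differences are listed instead of eyeballed. This is a rough sketch, not what I actually ran: the helper names are made up, and the optional user_agent argument is my own addition in case the server answers Firefox differently from curl.

```python
from urllib.request import Request, urlopen


def head_headers(url, user_agent=None):
    """Send a HEAD request and return the response headers as a dict.

    `user_agent` is optional; pass a browser's User-Agent string to
    imitate it (an assumption that the server varies its reply on it).
    """
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = Request(url, headers=headers, method="HEAD")
    with urlopen(request) as response:
        # Header names are case-insensitive, so normalise to lower case.
        return {name.lower(): value for name, value in response.getheaders()}


def diff_headers(a, b):
    """Return {header: (value_in_a, value_in_b)} for headers that differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Example usage (needs network access, so it is left commented out):
# ok = head_headers("http://www.bing.com/search?q=ford&first=16")
# blank = head_headers("http://www.bing.com/search?q=ford&first=206")
# print(diff_headers(ok, blank))
```

Headers like Date will always differ between two requests, so those can be ignored when reading the output.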
Comment if you have any ideas.

