Volunia – Updates (for the last time…)

Some readers have been asking me on Facebook and via email to post some updates about Volunia, the search engine I’ve already analyzed and blogged about.

So, let’s try to write something. But before starting, I have to say this: the point is that… there are no updates. It’s that simple.

As you know, I’ve been monitoring the growth of the search index for a while, but I’ve stopped checking because it simply was not growing. This is the graph:

The only maybe-interesting news is that they have launched a new feedback system (basically, a Q&A CMS), probably because they needed a more efficient way to manage user feedback (with the old “closed” system, I’d bet, they were receiving tons of duplicate feedback with no way to rank it or clean it up).

I’ve tried to talk about the good and new features of Volunia, but right now, 2 months after its pre-launch, it looks more like a failure than a revolution: the online buzz about this search engine has gone silent and the media have lost interest in it.

Since interest in it is almost dead, I don’t think I will blog about Volunia again. Do you think this “revolutionary” search engine is still alive?

From my point of view, I’d just consider it “dead” and focus on other emerging engines.


Is Volunia crawling the web?

Are Volunia bots crawling the web? The answer should obviously be… “yes”. But, actually, they don’t seem to be doing their work. Have a look:

This graph shows the number of search results returned for some common keywords over the last 2 days. I’ve set my Search Language to “Italian and English” and the SafeSearch Filter to “Medium”. This is the data table:

The number of results returned for those keywords has slightly changed. How is this possible?

If you look more carefully at that table, you’ll notice that the number of results isn’t actually changing at all. It seems that my searches are run against 2 different versions of the index (look at the number of results for “snow”: it’s 1618056 on 10th February, then it becomes 1583862 on the morning of 11th February and then goes back to 1618056 in the afternoon).
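
The reasoning above can be turned into a small check. This is only a minimal sketch: the function name `classify_series` and its labels are my own, and it assumes you already have the result counts for one keyword in chronological order. It flags a series as "alternating" when a count disappears and later comes back exactly, which is what two index snapshots served in turn would produce.

```python
def classify_series(counts):
    """Classify a chronological list of result counts for one keyword.

    Hypothetical helper: the labels are illustrative, not anything Volunia exposes.
    """
    if len(counts) < 3:
        return "not enough data"
    if len(set(counts)) == 1:
        return "static"
    # A value that goes away and then returns exactly suggests two snapshots
    # being served in turn, rather than an index that is really being updated.
    for i, value in enumerate(counts):
        later = counts[i + 1:]
        if value in later:
            between = later[:later.index(value)]
            if any(v != value for v in between):
                return "alternating"
    return "changing"


# The three "snow" counts quoted above, in chronological order.
snow = [1618056, 1583862, 1618056]
print(classify_series(snow))  # alternating
```

An exact return to an old count is only a hint, of course: with just three samples it cannot prove how many index versions exist.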

A search for “vada a bordo cazzo”, De Falco’s well-known exclamation, will return… no results if you use the double quotes, and some porn sites if you don’t use them. Wow! A slightly outdated index, isn’t it?

Is VoluniaBot taking a break?

UPDATE 20:45 13/02/2012: Yes, VoluniaBot did wake up (in the last 10 hours). Look at this graph:

Volunia – A quick update

This is just a (very) quick update about my tests on Volunia: yesterday I focused on network activity and front-end configuration, so today I’ve tried to focus my tests on search results and common GUI errors.

The first thing I’ve noticed: searches are quickly improving. There are many more results, and Volunia’s team has reported that web indexing has been made quicker. In order to “measure” the index growth, I’ve decided to create a table and a graph where I’ll periodically report the number of results shown for some test words. I’ll talk about this in the next few days. Well, don’t expect Google-level results in 2 days, but… it’s improving.

I’m a bit concerned about slowness. Searching for “I’m just trying to stress you” takes more than 8 seconds (and the system is not yet under heavy load). But that’s not the worst part. Searching for those words will lead to this results page:

Can you see that HTML encoding error?

Something similar happens if you search for “Giorgio Bonfiglio” (yes, this time with the double quotes): Volunia will show only one result.

Now, let’s try to click on that “repeat the search with…” link. This is the result:

That’s terrible. It’s not just about an HTML encoding error: Volunia is showing results matching on… that error! Moreover, look at the third result: is Volunia really showing, as the third result, a page where the only matching item is a first name?
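
I can’t see Volunia’s code, so this is only a guess at the mechanism, but one common way to get both symptoms is escaping the query for HTML and then treating the escaped string as if it were the query. A minimal sketch, assuming the query was typed with a plain ASCII apostrophe and using a deliberately naive tokenizer (`tokens` is a hypothetical helper, not Volunia’s):

```python
import html
import re

query = "I'm just trying to stress you"

once = html.escape(query)   # correct: escape once, only when rendering
twice = html.escape(once)   # bug: escaping text that is already escaped

print(once)   # I&#x27;m just trying to stress you
print(twice)  # I&amp;#x27;m just trying to stress you


def tokens(text):
    """Naive tokenizer: lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


# If the escaped form is what gets searched, pieces of the entity itself
# become search terms that the user never typed.
leaked = set(tokens(once)) - set(tokens(query))
print(sorted(leaked))  # ['x27']
```

The usual fix is to keep the raw query for searching and for building the “repeat the search” link, and to escape it exactly once at the moment it is written into the page.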

Volunia for geeks

No, you’re not drunk. This is my first English post: I planned to start writing in English a while ago but never did, as my audience is mainly Italian. But I’ve decided to try and see what happens.

I’ve already written about Volunia, before and after trying it out. I plan to go into more detail in the next few days, but now I want to show you some “geeky” things I’ve noticed in this search engine. I’m a sys/net admin, you know, so I couldn’t avoid opening Wireshark to check what Volunia was doing and sending through my computer (yes, this is the first time I’m really concerned about my privacy).

The first point: although all POST data (profile and so on) is sent through HTTPS to secure.volunia.com, the chat system (both public and private) is using Jabber through HTTP (chat.volunia.com).

Searches too are using GET through HTTP. This could be a concern, so I hope a full-HTTPS version will be released in the next few months (Google has been available over HTTPS for years).
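
To make the concern concrete: with a GET over plain HTTP, the whole URL, query string included, travels in cleartext, while over HTTPS the path and query are encrypted. Here is a minimal sketch of what someone sniffing the network could read. The `/search` path and the `q` parameter are hypothetical, since I’m not reproducing Volunia’s real URL format here.

```python
from urllib.parse import urlsplit, parse_qs


def exposed_on_the_wire(url):
    """Return the query parameters readable by a passive observer.

    Over HTTPS the path and query string are encrypted, so nothing is
    returned; over plain HTTP the full query string is visible.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        return {}
    return parse_qs(parts.query)


# Hypothetical search URLs: path and parameter name are made up.
plain = "http://www.volunia.com/search?q=private+medical+question"
secure = "https://www.volunia.com/search?q=private+medical+question"

print(exposed_on_the_wire(plain))   # {'q': ['private medical question']}
print(exposed_on_the_wire(secure))  # {}
```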

Second point: do you see something strange in the following screenshot (click on it to see a larger version)?

Have you noticed that “wp-content”? It’s the default WordPress directory for storing themes and uploads! That’s… strange. It looks like it’s only used for their news page (http://en.volunia.com/news/), but I’m not sure: why leave the WP login page open to the world? It’s just a matter of a few .htaccess lines.

Hope WordPress is up-to-date, at least.

I decided to check active connections using “netstat”, and this way I noticed that… the Volunia team doesn’t know what PTR records are. All their IPs are still using the default Tiscali reverse records. Not a real concern, right, but proper use of reverse DNS records makes a netadmin’s job of monitoring the network simpler. You can check what your users are connected to with a click, configure your firewalls to always accept outgoing traffic for hosts whose PTR ends in *volunia.com, and so on.
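
As a rough illustration of the “PTR ends in *volunia.com” idea, here is a minimal sketch using only Python’s standard library. The helper names `reverse_lookup` and `ptr_in_domain` are mine; the lookup itself relies on `socket.gethostbyaddr`, and the result depends on whatever resolver your machine uses.

```python
import socket


def reverse_lookup(ip):
    """Return the PTR name for an IP address, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None


def ptr_in_domain(ptr_name, domain):
    """True if the PTR name is the domain itself or a host inside it.

    Matching on "." + domain avoids accepting look-alikes such as
    "notvolunia.com" when checking for "volunia.com".
    """
    name = ptr_name.rstrip(".").lower()
    domain = domain.rstrip(".").lower()
    return name == domain or name.endswith("." + domain)


print(ptr_in_domain("chat.volunia.com", "volunia.com"))  # True
print(ptr_in_domain("notvolunia.com", "volunia.com"))    # False
```

Keep in mind that a PTR record is set by whoever controls the reverse zone for that IP, so before using it in a firewall rule it is worth checking that the name also resolves back to the same address.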

Anyway, this forced me to further study DNS records. I discovered that, while the main Volunia website (www.volunia.com) is behind Level3’s CDN, other services aren’t.

Both chat.volunia.com and secure.volunia.com are located in Italy. Is this only for testing purposes or is the system set to be used by the public this way?

Latency could be another concern, as I said in my first post: Italy is not the best place to put servers that need to be reachable all over the world. But, let’s look at NS records:

Do you see that? Those are the default NameCheap (or Enom) DNS servers. Why not use an in-house solution or a more professional service, like dyn.com or Route 53? Anything but low-cost services, please: Route 53 is only $6 per year!

Some other random notes:

Received: from [] (pitps004.volunia.net [])
	by pitsrv03.volunia.net (8.13.8/8.13.8) with ESMTP id q19F4dJj031611
	for <GiorgioBonfiglio>; Thu, 9 Feb 2012 16:04:39 +0100

Ouch. I thought they were planning to grow and become a little bigger than :(

I’m also wondering how many Power Users have been given access to Volunia as of today. Marchiori spoke about 100,000 PUs, but Volunia’s statistics tell a different story. How many users are using Volunia if only 200 are on the homepage?

Finally, I STRONGLY hope this “.NET” is just a “fake” header to prevent sites from blocking their bot and that Volunia is not Windows-based. Windows is killing big websites.

- - [09/Feb/2012:11:59:02 +0100] "GET /volunia.txt HTTP/1.0" 200 33 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv: Gecko/20100625 Firefox/3.6.6 ( .NET CLR 3.5.30729)"

I’ve reported the WordPress thing to Volunia’s team and asked them to properly set PTR records.