Task

The task is to create a PHP application that fetches data from remote websites, parses HTML code and displays the result in a simple way, using HTML elements. We want to test your ability to understand basic web programming techniques, and to write clean, high quality code.

From the page http://www.wikidot.com/, get the sites listed in the "Featured Sites" section. For each of the sites, go to <SITE_URL>/system:members and count the number of members. Show the result in a simple table:

Site       Link                          Number of members
Cocktails  http://cocktails.wikidot.com  1
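
A minimal sketch of the parsing and display steps, assuming the DOM extension is available. The function names extractLinks and renderTable are hypothetical, and the XPath expression is supplied by the caller because the task does not describe the actual markup of the "Featured Sites" section or of the members page.

```php
<?php
declare(strict_types=1);

/**
 * Returns one ['name' => ..., 'url' => ...] entry per <a> element matched by $xpath.
 * The XPath is an assumption: it must be adapted to the real page markup.
 */
function extractLinks(string $html, string $xpath): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    $nodes = (new DOMXPath($doc))->query($xpath) ?: [];
    foreach ($nodes as $anchor) {
        $links[] = [
            'name' => trim($anchor->textContent),
            'url'  => rtrim($anchor->getAttribute('href'), '/'),
        ];
    }
    return $links;
}

/**
 * Renders rows of ['name', 'url', 'members'] as a plain HTML table,
 * escaping every value that came from a remote page.
 */
function renderTable(array $rows): string
{
    $html = "<table>\n<tr><th>Site</th><th>Link</th><th>Number of members</th></tr>\n";
    foreach ($rows as $row) {
        $url = htmlspecialchars($row['url'], ENT_QUOTES, 'UTF-8');
        $html .= sprintf(
            "<tr><td>%s</td><td><a href=\"%s\">%s</a></td><td>%d</td></tr>\n",
            htmlspecialchars($row['name'], ENT_QUOTES, 'UTF-8'),
            $url,
            $url,
            $row['members']
        );
    }
    return $html . "</table>\n";
}
```

Counting members could reuse the same DOM approach: load <SITE_URL>/system:members and count the nodes matched by an XPath that selects one element per member, which again depends on markup not given here.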

To prevent fetching data with each request to your script, store results in a local cache (e.g. Memcached).
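
One common way to do this is the cache-aside pattern: read from the cache first and only fetch and parse when the key is missing. The sketch below is an illustration, not a reference solution. The helper name remember, the key, and the TTL are hypothetical; the cache object only needs get() and set() with the same shape as the Memcached extension's methods.

```php
<?php
declare(strict_types=1);

/**
 * Returns the cached value for $key, or computes, stores and returns it.
 *
 * $cache is any object with Memcached-style get($key) and
 * set($key, $value, $expiration) methods. Like Memcached::get(),
 * a miss is signalled by false, so a stored false cannot be
 * told apart from a miss (fine here, since the result is an array).
 */
function remember($cache, string $key, int $ttl, callable $compute)
{
    $value = $cache->get($key);
    if ($value !== false) {
        return $value; // hot path: no remote requests at all
    }

    $value = $compute(); // slow path: fetch and parse the remote pages
    $cache->set($key, $value, $ttl);

    return $value;
}

// Usage with the Memcached extension (assumes a server on 127.0.0.1:11211;
// fetchFeaturedSites is a hypothetical function that does the scraping):
//
//   $cache = new Memcached();
//   $cache->addServer('127.0.0.1', 11211);
//   $rows = remember($cache, 'featured_sites', 300, 'fetchFeaturedSites');
```

With a cold cache and many concurrent requests, every request that misses will run the slow path at once, so a real solution may want some form of locking around the refresh.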

We care about the code you write. Do not add fancy features. You get points for efficient design, accurate understanding of the task, functional minimalism, and clarity and expressiveness. Comments are not a substitute for readability.

We will stress-test your code, running 10, 100, and 1000 concurrent connections, to check for performance bottlenecks and scalability, by running ab -n 10000 -c 10/100/1000 http://localhost/your_script.php.

If you have difficulties or questions with any aspect of this exercise, you should ask us for help (leave a comment or send an e-mail): that will not be counted against you.

Environment

  • You can assume the Memcached server is running on 127.0.0.1:11211
  • All standard PHP extensions are available, including all XML parsing libraries (DOM, SimpleXML, etc.).

Deliverables

You should produce:

  1. A small design document (README) that captures:
    1. your understanding of the problem, and
    2. your proposed architecture.
  2. A few test cases written in shell scripting

Pack those in a tar.gz archive and send it to piotr@wikidot.com

Deadline

We are waiting till Friday, 9th Oct, noon (CEST).

Have questions? Ask them here! We are eager to help!

michal frackowiak, 2 Oct 2009, 07:36 UTC

Here is a small hint for developers:

The displayed result does not need to be shockingly up-to-date. The information listed on the main page (http://www.wikidot.com) is also cached for some period of time.

It should be more important to create an efficient way of fetching and displaying data and handling dozens of req/s than displaying up-to-date content. This is why we encourage using Memcached.
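
Because slightly stale results are acceptable, one possible approach is to keep the last result in the cache indefinitely and let only one request at a time refresh it, while all others keep serving the old copy. The sketch below assumes a cache object with Memcached-style get(), set(), add() and delete() methods; the function name freshOrStale, the ":lock" key suffix and the 30-second lock lifetime are hypothetical choices for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Serves a cached value, refreshing it at most once per $freshFor seconds.
 * While one request refreshes, concurrent requests get the stale copy.
 */
function freshOrStale($cache, string $key, int $freshFor, callable $compute)
{
    $now = time();
    $entry = $cache->get($key);
    $hasEntry = is_array($entry);

    if ($hasEntry && $entry['expires'] > $now) {
        return $entry['value']; // still fresh
    }

    // add() only succeeds if the key does not exist yet, so it acts as a
    // simple lock: one request wins and refreshes, the rest do not.
    $lockKey = $key . ':lock';
    $gotLock = $cache->add($lockKey, 1, 30);

    if (!$gotLock && $hasEntry) {
        return $entry['value']; // stale, but cheap and good enough
    }

    // Either we hold the lock, or the cache is completely cold and there is
    // nothing to serve. In the cold case several requests may still compute.
    try {
        $value = $compute();
        // Expiration 0: the entry never expires on its own; freshness is
        // tracked by the 'expires' field instead.
        $cache->set($key, ['value' => $value, 'expires' => $now + $freshFor], 0);
    } finally {
        if ($gotLock) {
            $cache->delete($lockKey);
        }
    }

    return $value;
}
```

The trade-off is that visitors may see data older than $freshFor seconds for the duration of one refresh, which matches the hint that freshness matters less than throughput.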


Michał Frąckowiak @ Wikidot Inc.
Visit my blog at michalf.me

last edited on 2 Oct 2009, 08:14 UTC by michal frackowiak

plisken, 4 Oct 2009, 10:47 UTC

Just out of curiosity: where exactly is this "Featured Sites" section?

Gabrys, 4 Oct 2009, 11:48 UTC

@plisken: The Featured Sites section is in the very center of http://www.wikidot.com/


Piotr Gabryjeluk
visit my blog

Benchmark ;>
TeRq, 4 Oct 2009, 20:03 UTC

Hello,

testing environment:
CPU: AMD Athlon 2000+
RAM: 768MB DDR1
HDD: 40GB Seagate Barracuda

My test :D

Can't run 10000/1000… it's killing my PC :>

Now writing the README, then packing everything and sending it to wikidot.com

gl ALL

last edited on 6 Oct 2009, 11:46 UTC by TeRq

michal frackowiak, 5 Oct 2009, 07:33 UTC

A small hint is that if

1. the data is already in the cache,
2. you optimize the part of your script that displays the data

you could get a few hundred req/s quite easily. But performance aside, we really look much more at overall design (even with such a small task) and elegance of code, which reflects your coding habits.


TeRq, 5 Oct 2009, 08:24 UTC

I'm not sure whether that hint is for me…
if so, higher performance on my PC is something of an abstraction :>
my machine handles ~400 req/s on a
<?php echo 'Hello world.'; ?>
script ;)
so 80 req/s is a magnificent result.
In my opinion, of course.

Yes, I could throw out the OO API (from the main page cache) and use some inline scripting, but is it worth it?

regards

PS. The script has already been sent to piotr@wikidot.com

michal frackowiak, 5 Oct 2009, 09:06 UTC

Well, not necessarily to anyone in particular, but we have solutions that go down below 1 req/s. Honestly I have not seen your solution yet (Piotr did not share it, still waiting), but 80 req/s is reasonable, especially if your machine can do 400 max.

And we have also written our own solutions to this task and we are getting ~ 800 req/s when data is in the cache, on a dual-core 2.6GHz machine. But there are many external factors, hardware and software.

We will go through it, probably together with authors of solutions. No worries.

BTW: are you using APC or eAccelerator?


TeRq, 5 Oct 2009, 09:22 UTC

@michal frackowiak

"BTW: are you using APC or eAccelerator?"

No, I don't… (as far as I know)

Could you paste me a benchmark from your testing environment, 10000/100?
I'm so curious

michal frackowiak, 5 Oct 2009, 09:32 UTC
ab -n 10000 -c 100 http://127.0.0.4:8081/task.php

This is on a 64-bit Mac. FastCGI and eAccelerator might be adding some extra boost here, but this setup is new so I did not bother much with tuning. The real application has other bottlenecks.


TeRq, 5 Oct 2009, 09:41 UTC

@michal frackowiak
thx,
but I meant my script on your testing environment ;) [starting with a clean cache]
(the test was mailed from: kontakt@terq.pl)

michal frackowiak, 5 Oct 2009, 12:42 UTC

@terq
I did some tuning to my PHP config, and see the whopping performance of your solution:

I cannot complete the test with -c 100 when the cache is not populated, because I am getting apr_poll: The timeout specified has expired (70007). This is probably caused by too many concurrent outgoing connections, so the thing is not reliable. It could also be Mac-specific. Strange, since I am using only 10 PHP processes to handle the traffic. It should work.

Anyway, nice performance.


michal frackowiak, 9 Oct 2009, 08:38 UTC

Still 82 minutes left for submissions…


TeRq, 9 Oct 2009, 11:16 UTC

time out :>

michal frackowiak, 9 Oct 2009, 12:01 UTC

Yep, we are digging through solutions — we will try to meet a few authors next week.

Thanks for all the submissions!


Submitting my solution
slawekwu, 9 Oct 2009, 12:40 UTC

After the T/O but anyway :)

4000 to 5000 req/s on a C2D, whether or not the cache is populated ;)

Concurrency Level:      100
Time taken for tests:   2.148 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      33527250 bytes
HTML transferred:       31050000 bytes
Requests per second:    4654.52 [#/sec] (mean)
Time per request:       21.484 [ms] (mean)
Time per request:       0.215 [ms] (mean, across all concurrent requests)
Transfer rate:          15239.59 [Kbytes/sec] received
2007-2009 Copyright Wikidot Inc.