--- Log opened Thu Feb 26 00:00:26 2009
00:11  * mae thinks that the user should have to explicitly say whether they want something to match POST or GET
00:11 < mae> if they don't care, they can put _
00:12 < mae> (or all the other REST commands too for that matter)
04:26 < mae> something is funky with the response codes returned from the guestbook app
04:27 < mae> it is returning a 303 on a page that should be a 200
04:27 < mae> I think there is a problem with the way the overlaid post/get requests are
04:27 < mae> for some reason 303 is set even when it is a GET and not a POST
04:43 < stepcut> mae: which page is that ?
04:44 < stepcut> getEntries and getREADME do not explicitly call, 'ok', maybe they should. I am not really sure what the expected behavior is there
04:45 < h_buildbot> Build for ghc-6.8.3 OK. Test suite ran. 15/77 test cases failed. Check http://buildbot.happstack.com/ for details.
04:49 < mae> stepcut: yeah, I think the problem is that, for some reason the first line methodM GET >> seeOther "/entries" (toResponse ()) still sets the response code to 303
04:49 < mae> which is weird
04:49 < mae> because methodM GET should end up with mzero when it is /entries
04:49 < mae> ahh then again maybe not
04:49 < mae> gosh the semantics are so funky
04:50 < mae> anyways, try firing up the guestbook with the logging I added
04:50 < mae> you can see a lot of info:
04:50 < mae> - [26/Feb/2009:09:30:01 +0000] "GET /entries 1.1" 303 4052 "" "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv: Gecko/2009011913 Firefox/3.0.6"
04:50 < stepcut> mae: I can't duplicate this behavior
04:50 < mae> see how it is a 303 response code for /entries ?
04:51 < mae> stepcut: did you pull the patch I just pushed up and then 'cabal install' on happstack-server ? It tells you the response code.
04:51 < mae> All the fileServe stuff has the correct response code
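To compare response codes across requests, the access-log lines pasted in this conversation can be parsed mechanically. The following is a minimal Python sketch, not part of happstack: the regular expression is tailored to the lines exactly as they appear in this log, and the names `LINE` and `parse_line` are made up for illustration.

```python
import re

# Pattern fitted to the lines as pasted above (an assumption, not a full
# Apache combined-log parser): [time] "METHOD PATH VERSION" STATUS SIZE "REFERER" "AGENT"
LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<version>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LINE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["status"] = int(fields["status"])
    fields["size"] = int(fields["size"])
    return fields

sample = ('- [26/Feb/2009:09:30:01 +0000] "GET /entries 1.1" 303 4052 "" '
          '"Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv: Gecko/2009011913 Firefox/3.0.6"')

entry = parse_line(sample)
print(entry["path"], entry["status"])  # /entries 303
```

Filtering on `entry["status"] == 303` for paths that should return 200 would surface the discrepancy discussed here.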
04:53 < stepcut> hold on, I have to install hstringtemplate
04:53 < h_buildbot> Build for ghc-6.10.1 OK. Test suite ran. 15/77 test cases failed. Check http://buildbot.happstack.com/ for details.
04:54 < mae> the idea is now we should be able to use something like this: http://awstats.sourceforge.net/
04:54 < stepcut> yeah
04:54 < stepcut> i'm building the latest now
04:54 < mae> k
04:55 < mae> it's also useful to see what response codes etc are being sent for each request :)
04:55 < mae> I implemented it at the lowest level
04:55 < mae> in the handler
04:55 < mae> so nothing short of the oomkiller should stop any request from getting logged :)
04:55 < mae> (or the runtime dies for some reason)
04:57 < stepcut> works fine for me:
04:57 < stepcut> - [26/Feb/2009:09:56:35 +0000] "GET / 1.1" 303 0 "" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_6; en-us) AppleWebKit/525.27.1 (KHTML, like Gecko) Version/3.2.1 Safari/525.27.1"
04:57 < stepcut> - [26/Feb/2009:09:56:35 +0000] "GET /entries 1.1" 200 3735 "" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_6; en-us) AppleWebKit/525.27.1 (KHTML, like Gecko) Version/3.2.1 Safari/525.27.1"
04:57 < mae> hmm
04:57 < mae> strange
04:57 < mae> maybe its firefox
04:58 < stepcut> - [26/Feb/2009:09:58:01 +0000] "GET /entries 1.1" 200 4042 "" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv: Gecko/2009011912 Firefox/3.0.6"
04:58 < stepcut>  
04:58 < mae> maybe its firefox
04:58 < mae> arg
05:00 < mae> lol
05:00 < mae> what the hell
05:00 < mae> is my code sick
05:00 < mae> - [26/Feb/2009:09:59:17 +0000] "GET /entries 1.1" 303 4052 "" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; WOW64; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.0.04506; Media Center PC 5.0)"
05:00 < mae> happens on IE too
05:00 < stepcut> you broke it :)
05:00 < stepcut> it's 4AM, I am going back to bed
05:00 < mae> dude
05:00 < mae> it works for you wtf?
05:01 < stepcut> yeah
05:01 < mae> is this the guestbook app in the repository?
05:01 < stepcut> yeah
05:01 < mae> ?!
05:01 < mae> very very strange.
05:01 < mae> ok well
05:02 < mae> ttyl, i will continue tomorrow :)
05:05 < mae> don't know what was broke
05:05 < mae> but i did a full clean, and then full build/install of everything
05:05 < mae> works now
05:06 < stepcut> yay?
10:43 < mae> yeah YAY!
10:59 < mightybyte> Yay what?
11:09 < lanaer> mew?
15:06 < wchogg> mae : Is it possible that we could push the release back to thursday or friday instead?
16:12 < wchogg> Heh...I accidentally fed simpleHTTP a ServerPartT that was equivalent to "seeOther "foo" "" `mappend` otherstuff".  Doesn't exactly work, now does it?
16:12 < stepcut> :)
16:19 < Saizan> that's the same as seeOther "foo" "", right?
16:20 < wchogg> Right.
16:20 < wchogg> It short circuits.
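The short-circuit behaviour can be illustrated with a toy model in Python. This is a sketch under assumptions, not happstack's implementation: a handler is modelled as a function returning a response tuple or `None` (declined), and `first_success`, `see_other`, and `exact_path` are hypothetical names standing in for the combination described above.

```python
def see_other(location):
    """Handler that always succeeds with a 303 redirect."""
    def handler(request):
        return (303, location)
    return handler

def exact_path(path, body):
    """Handler that succeeds only on an exact path match, else declines."""
    def handler(request):
        return (200, body) if request == path else None
    return handler

def first_success(*handlers):
    """Try handlers left to right; the first non-None response wins."""
    def combined(request):
        for h in handlers:
            response = h(request)
            if response is not None:
                return response
        return None
    return combined

# A handler that never declines hides everything combined after it.
broken = first_success(see_other("foo"), exact_path("/entries", "guestbook"))
fixed = first_success(exact_path("/entries", "guestbook"), see_other("foo"))

print(broken("/entries"))  # (303, 'foo')
print(fixed("/entries"))   # (200, 'guestbook')
```

In this model, putting the unconditional redirect last turns it into a fallback instead of a handler that shadows the rest.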
16:20 < stepcut> the cheese is old and moldy
16:21 < wchogg> huh?
16:32 < lanaer> tasty cheese
16:32 < wchogg> Apparently it's dadaism awareness day on #happs
16:36 < lanaer> :D
21:28 < mae> har!
21:36 < wchogg> mae: yar?
21:42 < mae> wchogg: why are you worried about the release day?
21:44 < wchogg> mae : Oh, because I was getting skittish about my ability to get all the stuff I wanted to done before weds.
21:45 < wchogg> mae : I've spent the whole day working on happstack/tut stuff so I'm not as nervous.
21:45 < mae> ok ):
21:46 < wchogg> BTW, if you pull the current darcs HAppSHelpers & happstack-tutorial then you could see the multimaster chapter.
21:46 < wchogg> mae : I'm not sure how much detail I should include.
21:46 < wchogg> mae : but I think it's enough to get started
21:47 < mae> tutorial means lots of detail usually
21:48 < mae> but its up to you how much you want
21:49 < wchogg> mae : well, I tried to give a decent conceptual model about what needs to happen.
21:54 < wchogg> mae : if there's something you feel like wasn't answered in terms of "get you started" information, please let me know.
21:58 < mae> http://blog.happstack.com/2009/02/26/happstack-now-outputs-apache-combined-logs/
21:59 < wchogg> cool
21:59 < mae> so i am looking at happstutorial
22:00 < mae> where is the multimaster bits?
22:00 < mae> (i noticed that you updated a lot of content besides that, thank you)
22:00 < wchogg> I put the chapter down at the bottom of the table of contents until I know where I want to stick it.
22:03 < mae> m
22:03 < mae> what is it called?
22:08 < wchogg> Sorry, lost internet for a sec.
22:11 < wchogg> I'm not sure why you wouldn't be seeing the chapter in the table of contents if you have the current darcs.
22:34 < jsn> hello
22:34 < jsn> i am curious about the consensus algorithm you guys use
22:34 < jsn> we are chatting about it in #haskell-in-depth
22:35 < Saizan> what i know is that the transactions get total-ordered by the spread network
22:36 < jsn> okay
22:36 < jsn> in fact, total ordering of messages is just the same problem as consensus
22:36 < Saizan> yeah
22:36 < jsn> so we would need at least two rounds there, as well
22:37 < jsn> and it is very important to know, what happens to nodes that do not get the message or are lagged?
22:37 < jsn> that's the other thing -- if you do not handle lagged nodes, you have no way to actually handle a failure
22:38 < jsn> i looked at this all for a long time last year
22:39 < jsn> my boss was convinced we could handle all credit card transactions in a distributed manner on top of AWS
22:39 < wchogg> Saizan : do we really do anything at this level or is it all just handled in the Spread daemon?
22:41 < jsn> does the spread daemon just push messages to all other nodes?
22:41 < Saizan> wchogg: i think lagged/failed nodes are excluded from the group by the spread daemon, though i guess we also have to handle what happens when they recover.
22:41 < jsn> is there only one spread daemon?
22:41 < Saizan> no, each server has its own
22:41 < jsn> ah
22:41 < jsn> well, to have an accurate list of failed processes, you have to solve consensus
22:42 < jsn> otherwise, some spread daemons will push things to nodes that other ones think have failed
22:42 < jsn> &c.
22:42 < jsn> so the dumbest way to do consensus is this:
22:42 < Saizan> well, yeah, we delegate solving consensus to spread
22:42 < jsn> you push a request to everyone
22:42 < jsn> then everyone pushes their votes to everyone else
22:43 < jsn> everyone tallies the votes to know whether to commit or not
22:43 < jsn> if a node does not get a majority, it goes to sleep
22:43 < jsn> well, okay, but we need to know if spread actually solves it or not
22:44 < jsn> oh, in the above, there are "rounds" of fixed length -- in the first round, we get the requests that we vote on in the second round
22:44 < Saizan> as in they have proved so?
22:44 < jsn> yes
22:44 < jsn> i can root around for the proof
22:45 < jsn> also, there are some details about clock synchronization in all this
22:45 < jsn> a node knows to go to sleep because it waited for the second round, did not get a majority of votes in time and went to sleep
22:45 < jsn> if its clock is off, it will do crazy stuff
22:45 < jsn> however, that is unavoidable
22:46 < jsn> because -- due to FLP, 1985, http://groups.csail.mit.edu/tds/papers/Lynch/jacm85.pdf -- you can not solve consensus with one faulty process in an asynchronous setting
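The vote-broadcast round described above can be sketched as a small deterministic simulation in Python. This is one reading of the description, not Spread's or happstack's algorithm: the commit/abort/sleep rule, the function name `run_round`, and the `delivered` map (which votes reached which node before the round's deadline) are all assumptions made for illustration.

```python
def run_round(members, votes, delivered):
    """Simulate the voting round for one request.

    members:   list of node ids (every node is assumed to share this view)
    votes:     node id -> True (yes) / False (no)
    delivered: receiver id -> set of sender ids whose vote arrived in time
    Returns node id -> "commit", "abort", or "sleep".
    """
    majority = len(members) // 2 + 1
    outcome = {}
    for node in members:
        seen = delivered[node] | {node}  # a node always knows its own vote
        yes = sum(1 for sender in seen if votes[sender])
        no = len(seen) - yes
        if yes >= majority:
            outcome[node] = "commit"
        elif no >= majority:
            outcome[node] = "abort"
        else:
            # Not enough votes arrived before the deadline: stop participating.
            outcome[node] = "sleep"
    return outcome

members = ["a", "b", "c", "d", "e"]
votes = {n: True for n in members}
delivered = {n: set(members) for n in members}
delivered["e"] = {"a"}  # node e is lagged and only hears from a

print(run_round(members, votes, delivered))
```

Because a yes-majority and a no-majority cannot both exist in the same group, any two nodes that stay awake reach the same decision; the lagged node `e` sees only two votes out of five and goes to sleep.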
22:46 < Saizan> heh, reading about distributed consistency you get lots of "however that's unavoidable" situations :\
22:47 < jsn> well, distributed computing is very bad
22:47 < jsn> it is like you lose a bunch of the computer
22:47 < jsn> it's memory has random contents and it may or may not perform the computation
22:47 < jsn> s/it's/its/
22:48 < jsn> so, i have actually put a lot of personal time and effort into fleshing out a system based on vote broadcast
22:48 < jsn> that is not the one that most people have put effort into, though
22:48 < jsn> because the message cost is really high
22:48 < jsn> O(n^2)
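The quadratic cost follows from counting messages. A small Python sketch, assuming one node pushes the request to the others and then every node sends its vote to every other node (the function name is made up for illustration):

```python
def messages_per_decision(n):
    """Count messages for one decision among n nodes."""
    request_round = n - 1      # proposer pushes the request to everyone else
    vote_round = n * (n - 1)   # everyone pushes a vote to everyone else
    return request_round + vote_round

for n in (3, 8, 64):
    print(n, messages_per_decision(n))
```

Under these assumptions the total is n^2 - 1, so 8 nodes need 63 messages per decision and 64 nodes need 4095.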
22:49 < Saizan> that's also called the king algorithm?
22:50 < jsn> no
22:51 < jsn> the king algorithm -- which i'd not seen before -- is for byzantine generals
22:51 < jsn> i honestly don't know what the name is for broadcast voting is
22:52 < jsn> s/voting is/voting/
22:52 < Saizan> ah, right, here we assume nodes won't send wrong information, but just be lagged or dead
22:53 < jsn> indeed
22:53 < stepcut> jsn: the current happstack solution is to say, "wow, this distributed stuff is really tricky. Let's use this library written by smart people who read all those papers we didn't" ;)
22:53 < jsn> which is what?
22:53 < stepcut> jsn: we use the spread mode that ensures the message is not delivered to the client unless all nodes that are part of the group have received the message
22:54 < stepcut> jsn: not sure what spread does about lagged nodes, etc. That is something we are starting to document now
22:54 < jsn> wow
22:54 < jsn> okay
22:55 < jsn> well, it is surprising that you would accept such a solution without knowing how many lagged nodes it can handle, for example
22:55 < stepcut> jsn: really, it is an attempt to avoid 'not invented here syndrome'. There is also the problem that not all of the knowledge from the original HAppS developers has been translated to the happstack team yet. So, we are having to learn things that maybe the original developers already knew.
22:55 < stepcut> jsn: the happs development process was fairly opaque, so, unfortunately, a lot of useful information was never recorded or shared
22:55 < jsn> is this the "log4perl" spreader
22:55 < jsn> ?
22:56 < Saizan> http://spread.org/
22:57 < stepcut> jsn: happstack is (unfortunately), riding on the belief that the original happs developers made smart choices when picking happs. Though, they do have a decent track record. The integration with spread is actually very minimal. So, if it turns out to suck, we should be able to swap it out for something else.
22:58 < stepcut> s/when picking happs/when picking spread/
22:58 < jsn> well, i think there is a larger problem
22:58 < stepcut> jsn: oh?
22:58 < wchogg> stepcut : Do you have any ideas on what we can do to get a better feel for the usefulness of spread?  Scalability tests?
22:58 < jsn> ACIDity is not of universal value
22:58 < jsn> it is very expensive, comparatively
22:59 < stepcut> jsn: yes, happstack only has limited support for acidity on a distributed basis
22:59 < jsn> if you are storing photos, for example, transactions are a lot of overhead
22:59 < jsn> but you still want a paradigm that ensures the photos are replicated
23:00 < stepcut> jsn: transactions are limited to a single component, and for multimaster, I believe the default is 'eventual consistency'
23:00 < jsn> what does it mean that transactions are limited to a single component?
23:00 < jsn> a component is...?
23:00 < stepcut> jsn: I think it is possible that immediately after an update, two queries to two different machines might return different results. One might return the old value, and one might return the new value. Though after a short period of time, they would both return the new value.
23:01 < Saizan> stepcut: only for different components though
23:01 < jsn> what is a component?
23:01 < jsn> like, a record?
23:01 < Saizan> a component is a data structure
23:01 < Saizan> more like a table
23:02 < Saizan> or group of them
23:02 < Saizan> but it's just any serializable haskell type that you declare so
23:04 < Saizan> stepcut: so we've a different group for each component on the spread network?
23:05 < stepcut> jsn: originally, happs had only one global persistent data store, which had the ACID properties. Components just allow you to have a bunch of independent persistent data stores in your program. updates and queries to different components are run in different threads and happen in parallel.
23:06 < stepcut> Saizan: not now, because we only have multimaster. But maybe when we introduce sharding.
23:06 < jsn> hmm
23:06 < jsn> okay, i don't know guys
23:06 < stepcut> Saizan: what were you referring to when you said 'different components' ?
23:07 < jsn> i am going to look over the spread papers
23:07 < stepcut> jsn: yeah, me too :)
23:07 < Saizan> stepcut: well, then we're total ordering all the transactions even if they are about different components
23:07 < jsn> the sad thing about distributed systems is the ideas are very simple but the explanation is always so vague and unconvincing
23:07 < stepcut> jsn: the multimaster stuff is very new, so maybe it doesn't really scale. That is something we plan to find out in the next few months.
23:07 < jsn> for example, here is proof for the broadcast system above:
23:08 < jsn> there is either a majority or not. if a node receives the majority in time, it will commit the same thing as the other nodes that commit. if it does not, it goes to sleep.
23:08 < jsn> however, i bet you that did not convince you :)
23:09 < stepcut> jsn: interesting...
23:09 < Saizan> it's quite convincing actually, but maybe because i've already seen proofs like that :)
23:10 < jsn> well, okay, good
23:10 < jsn> there is one caveat
23:10 < jsn> every node must have the same picture of the membership of the group
23:10 < jsn> otherwise, how does it know it did not get a majority vote?
23:11 < jsn> that is also why it has to go to sleep at the end of a round when it did not get enough votes
23:11 < stepcut> Saizan: yes... we may be total ordering the *delivery* of all update events even if they are about different components. And maybe that is not a good thing. Once the event has been delivered to a happstack instance, the internal threading might affect the ordering of events targeting different components.
23:11 < jsn> because that means that if it was sent a group membership update and it didn't commit it, it is not able to accurately handle votes any longer
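The membership caveat can be made concrete with a short Python sketch. It assumes a node counts only votes from members it knows about and needs strictly more than half of its own view; `has_majority` and the example views are hypothetical.

```python
def has_majority(votes_from, view):
    """True if votes from known members exceed half of this node's view."""
    counted = votes_from & view  # votes from unknown senders are ignored
    return len(counted) > len(view) // 2

current_view = {"a", "b", "c", "d", "e"}
stale_view = {"a", "b", "c"}   # this node missed the update that added d and e
votes = {"a", "b"}

print(has_majority(votes, current_view))  # False
print(has_majority(votes, stale_view))    # True
```

The same two votes are a majority under the stale view and not under the current one, which is why a node that missed a membership update cannot safely keep tallying.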
23:12 < stepcut> jsn: yeah... I would really like to hear your opinion of spread, and if you think it will work for us.
23:12 < Saizan> stepcut: ah, quite a waste
23:12 < jsn> well, actually, this relates to what i was saying earlier about branching and dynamo
23:12 < stepcut> Saizan: yeah, but we are really looking at the very first functioning version of multimaster, so maybe it's good that there are lots of easy improvements to make ;)
23:13 < Saizan> stepcut: i guess :)
23:13 < jsn> you can just throw all the messages into a bucket and order them later
23:13 < jsn> anyways, i will look over the spread papers
23:14 < stepcut> jsn: cool. I have a reasonable understanding of how happstack uses spread. But, spread is a complete black-box to me right now. I know what the two guarantees it provides are, but no idea how it does that :)
23:17 < stepcut> jsn: I expect that multimaster (without sharding) should scale for a while. Even if it only scales well to 8 machines, that is still better than 1. And, with sharding, you can split your data into, say, 8 shards, each with 8 replicators, and get up to 64 machines pretty fast. So, even if spread doesn't work that well for bigger numbers, it should be a good interim step.
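The 8 x 8 layout can be sketched in Python. Sharding did not exist in happstack at the time of this conversation, so this is purely illustrative: the hash-based routing, the constants, and the function names are assumptions.

```python
import hashlib

SHARDS = 8
REPLICAS_PER_SHARD = 8

def shard_for(key):
    """Map a key to a shard using a stable hash (hypothetical routing rule)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % SHARDS

def machines_for(key):
    """All (shard, replica) pairs that would hold a copy of this key."""
    shard = shard_for(key)
    return [(shard, replica) for replica in range(REPLICAS_PER_SHARD)]

print(SHARDS * REPLICAS_PER_SHARD)           # 64
print(len(machines_for("guestbook-entry")))  # 8
```

Each update would then need agreement among only the 8 replicas of one shard rather than all 64 machines.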
23:18 < stepcut> jsn: according to what I read, spread does not intend to replace multicasting to millions of members, but might scale fine to thousands...
23:18 < jsn> millions of members is probably of little use to most people though
23:19 < jsn> even thousands is a stretch
23:20 < stepcut> jsn: i think facebook recently bought 50,000 new servers
23:20 < jsn> communication among shard members is definitely going to be more expensive than communication between shards, sure
23:20 < jsn> stepcut: interesting
23:21 < jsn> what we are going to discover in the next couple of years is that static analysis is awesome
23:21 < jsn> and it lets you scale to multiple cores really well
23:21 < jsn> and that you get a lot more out of your hardware if it isn't pickling stuff all the time
23:22 < jsn> social networks have a usage profile that doesn't really need transactions, anyways
23:22 < jsn> as they tend just to pack down a lot of data which is rarely changed
23:23 < jsn> however, what i was getting at before -- shard members need to communicate a lot, to maintain consensus. Shards, not so much.
23:24 < jsn> you want to have a bunch of nodes that sit on top of the shards to adjudicate changes that cross shards, though
23:26 < stepcut> jsn: well, the aim of happstack (or at least happs) is to scale big. So, a lot of the current design decisions have been based on what big sites like amazon, google, and facebook have said publicly
23:27 < jsn> ah
23:27 < jsn> well, all those sites have separated storage from their application servers
23:27 < stepcut> jsn: for example, they *all* say that transactions don't scale. eBay is famous for not using transactions at all on their DBs
23:27 < jsn> hehe
23:27 < jsn> well, you know amazon is still a heavy oracle user
23:28 < jsn> financial data is often handled transactionally even when many other things are not
23:28 < jsn> Korollary, in #haskell-blah, knows more about that
23:28 < stepcut> jsn: so, if happstack based on spread can't scale to 50,000 machines, that would be considered a problem
23:29 < jsn> yes, okay, i can see that
23:29 < jsn> because transactions don't scale, neither do joins
23:30 < jsn> what you get instead are denormalized datasets
23:30 < jsn> but i think more important than that -- no one expects to have a single language, ACID storage layer for a scalable application
23:31 < stepcut> jsn: well, maybe happstack will give them something unexpected :)
23:31 < jsn> hmm
23:31 < jsn> it would be unwelcome
23:32 < jsn> it is important to be able to use ruby and python and c with the scalable data store
23:32 < stepcut> jsn: yes. I think the current trend is to provide your datastore as a web service though, that can be accessed from any language?
23:33 < jsn> or a network service, like dynamo or bigtable
23:33 < stepcut> jsn: that seems to be what amazon is doing with S3, etc
23:33 < jsn> which is to say, it is expected that all application nodes go through the network service
23:33 < jsn> be it bigtable, googlefs, hadoop or whatever
23:34 < stepcut> yeah
23:34 < stepcut> anyway, I have to go to bed, but it was good talking to you
23:34 < jsn> sure
23:34 < stepcut> I hope you'll stick around and share your knowledge
23:35 < jsn> oh, well
23:35 < jsn> maybe you could read a paper of mine?
23:35 < stepcut> jsn: what topic?
23:35 < jsn> http://github.com/jsnx/members-only/blob/master/notes/Consistent%20Logging%20Algorithm
23:35 < stepcut> jsn: I'll take a look tomorrow
23:35 < jsn> aye
23:36 < jsn> do let me know what you think -- it is so hard to find people to vet these things
23:37 < jsn> were HAppS to demonstrate a scalable approach to transactions -- one that allowed scalable transactions to happen but did not force you to pay for them otherwise -- it would indeed impress people
23:37 < jsn> however, we have to know a little more about transactions to really do that. i'm going to read through the spread stuff before i say any more
23:37 < stepcut> yeah
23:37 < jsn> g'night
23:37 < stepcut> night
23:46 < jsn> Saizan: The simple, short explanation of the FLP paper is: If you don't have timeouts and one process fails, you will wait forever. If you have timeouts without clock synchronization, you'll have unpredictable behaviour.
--- Log closed Fri Feb 27 00:00:27 2009