01:51:08 <stepcut> ok, I think the happstack website is a little better now, http://happstack.com/
01:51:18 <stepcut> still needs a real graphic designer though
01:51:29 <stepcut> but hopefully the content is easier to find
01:55:02 <mightybyte> MACID *distributed* persistent data storage layer?
01:55:46 <mightybyte> Have you been doing some serious work I don't know about?
01:57:01 <mightybyte> If so, I definitely want in.
02:00:05 <stepcut> mightybyte: it is distributed. doesn't have sharding yet though
02:00:22 <mightybyte> Ahhh
02:03:44 <stepcut> I should look into that
02:04:48 <sm> stepcut: bravo, much better
02:05:09 <sm> the code blocks or something are forcing the page to be a bit too wide
02:05:22 <stepcut> sm: where at ?
02:06:05 <sm> eg the front page gives me a horizontal scrollbar in a ~800px-wide window
02:06:08 <Gracenotes> hm, readability-wise, might need some left margins, at least the header and footer
02:06:16 <sm> 1200px I mean
02:06:24 <stepcut> sm: it is currently (unfortunately) a fixed width layout..
02:06:34 <stepcut> sm: perhaps it is too wide though..
02:06:42 <Gracenotes> in addition to making things less wide :) same deal with me on 1200px
02:07:19 <stepcut> ok, apparently my monitors are too wide :)
02:07:45 <stepcut> actually, I think the problem is that the font size is too big
02:07:49 <Gracenotes> I like it
02:08:02 <stepcut> how does it look if you shrink it one step?
02:08:09 <Gracenotes> hm, yeah, Ctrl+- looks fine too, though
02:08:30 <Gracenotes> though the tweet sidebar is on the verge of unreadable, and the timestamps within moreso
02:08:59 <sm> no no, full-size fonts are vital
02:09:16 <stepcut> k, I'll try a narrower column
02:09:17 <Gracenotes> also, perhaps move the API reference to the top of the docs page?
02:09:46 <Gracenotes> nitpicky, but since it's off-screen now
02:09:55 <stepcut> Gracenotes: yeah, I was thinking that too actually, when I went to find it
02:10:44 <sm> it's really a great improvement. Maybe add info on how to update this site, on Community or somewhere
02:11:00 <gcollins> congrats on the new site guys, looks tons better
02:11:02 <Gracenotes> oh, by the way, I implemented digest authentication (for hackage, actually)
02:11:55 <Gracenotes> that is, it compiles. if you guys could use it somehow, that's fine by me, though there's just the pesky thing of making sure it works... ;) next on the todo list
02:12:18 <stepcut> gcollins: thanks. Making some tweaks still.
02:12:39 <stepcut> Gracenotes: nice
02:12:50 <dcoutts> Gracenotes: oh cool
02:15:47 <Gracenotes> dcoutts: yep, compliant with the RFCs and whatnot. the thing is, it's incompatible with the htpasswd format (the username + realm + password have to be hashed together at the beginning and stored)
02:16:01 <stepcut> Gracenotes: we would definitely like to try to get that in..
02:16:24 <dcoutts> Gracenotes: yes, htpasswd and htdigest are sadly different
02:18:40 <dcoutts> Gracenotes: I don't think there is any way to upgrade existing stored accounts automagically from using plain to digest
02:19:32 <Gracenotes> yeah. the auth module would have to demultiplex. but those who don't want to send their passwords over near-plaintext could switch.. eh
02:20:23 <stepkut> sm: is the width better now ?
02:21:46 <sm> stepkut: not too much.. on a mac with normal font size, I get the scroll bar below about 1120px
02:22:00 <dcoutts> Gracenotes: right, we could tell users to reset their passwords, and the new one will be stored for use with digest
02:22:14 <sm> just shrink your window, you should see it
02:22:29 <sm> you must have massive browser windows
02:22:42 <Gracenotes> though a bit too wide still for 1024x768, which I don't use
02:23:12 <stepkut> ACTION is at 1600x1200 
02:23:15 <dcoutts> Gracenotes: though come to think of it, I'm not quite sure if we can offer different methods to different users
02:23:32 <dcoutts> Gracenotes: since we do not know the user's username before they try to authenticate
02:23:38 <Gracenotes> though this demultiplexing is easier said than done... the username is needed first to determine which kind of challenge. yeah.
02:24:14 <dcoutts> mm, we might have to have a flag day conversion, that'd be kind of annoying
02:25:35 <Gracenotes> well, there could simply be a one-time thing that if a user is not flagged as using digests, they have to authenticate to basic once and then submit a change of password (which could be their old password)... fun stuff.
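A minimal sketch of the storage point above: RFC 2617 digest auth derives its stored hash (HA1) from the username, realm, and password together, which is why plain htpasswd entries can't be reused. This assumes the pureMD5 package; names are illustrative:

    import qualified Data.ByteString.Lazy.Char8 as L
    import Data.Digest.Pure.MD5 (md5)

    -- RFC 2617 HA1: the realm is hashed in with the username and
    -- password up front, so a stored htpasswd hash cannot be converted
    ha1 :: String -> String -> String -> String
    ha1 user realm password =
        show (md5 (L.pack (user ++ ":" ++ realm ++ ":" ++ password)))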
02:25:35 <stepkut> ACTION tries some css hacking
02:25:48 <Gracenotes> oh, and not strictly happstack-related, sorry -.-
02:25:49 <stepkut> ACTION decides to wait another day for that
02:26:38 <stepkut> I would like the happstack.com layout to be a little more fluid, but css sucks at that, so I need to come at it with a fresh mind before taking that on
02:34:16 <sm> without the widths on the main column div and .code divs, it seems to work quite well here in chrome
02:34:33 <sm> things just flow around the twitter box in a narrow window
02:34:56 <stepkut> sm: hmm, I'll look at that tomorrow..
02:35:19 <stepkut> I narrowed it a bit more for now, does it seem more reasonable ?
02:36:50 <sm> stepkut:  yes, that's good, similar to other fixed-width sites
02:37:27 <stepkut> spiffy
02:37:35 <sm> indeed!
02:38:05 <stepkut> anything else I should change?
02:38:22 <stepkut> I moved the API reference up top (not live yet)
02:38:57 <sm> instructions for updating the site ?
02:39:12 <stepkut> I put some minor details in the footer
02:39:29 <stepkut> I'll add a full section to the community page soon
02:39:32 <stepkut> (or you could!)
02:40:19 <sm> I probably would have, if these very same instructions had been visible! Should help would-be happs doc'ers
02:40:34 <stepkut> yep
02:40:48 <stepkut> I am going to rebuild the site and upload it, one moment
02:41:33 <stepkut> I should update the roadmap as well
02:41:49 <sm> I just lately made my sites update on darcs push.. loving it
02:41:59 <stepkut> sm: oh ?
02:42:11 <stepkut> sm: ah, I see..
02:42:20 <stepkut> sm: I actually build .debs and apt-get install them :-/
02:42:45 <sm> I have darcs posthooks now that run "make commithook", which I have do whatever
02:43:25 <stepkut> neat
02:43:48 <stepkut> I should start by making a script at least
02:43:58 <stepkut> there is no reason I couldn't automate this more
02:44:07 <sm> like hakyll build.. for a happstack server, I guess it could be a compile followed by supervisorctl restart ... (I run my servers with supervisord)
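Roughly the setup being described, assuming a darcs repo on the server; the paths, target names, and the supervisord program name are all illustrative:

    # _darcs/prefs/defaults in the server-side repository:
    apply posthook make commithook

    # Makefile (recipe lines must be tab-indented):
    commithook:
    	cabal install
    	supervisorctl restart mysite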
02:44:11 <stepkut> it's only three lines :)
02:44:22 <sm> I'm all into automation
02:44:36 <stepkut> yeah
02:44:43 <stepkut> why do what a computer can do for you
02:44:43 <sm> it often feels like yak shaving, but I'm seeing payoffs
02:45:02 <stepkut> I hope tibbe finishes yak shaving soon and finishes hyena :)
02:53:06 <stepkut> ok, updated!
02:53:13 <stepkut> you can tell cuz the bot went and came back :)
02:53:44 <sm> oh, the site is also a bot ?
02:53:48 <stepkut> yes
02:53:57 <stepkut> it just logs the channel, nothing else so far
02:54:10 <sm> that should make for some pretty compelling integration opportunities
02:54:12 <stepkut> I haven't figured out what cool thing we can do that involves the website and the bot yet ;)
02:54:20 <stepkut> it *seems* like a cool idea though
02:54:49 <sm> announcing tweets...
02:55:22 <stepkut> I think I need to modify the bot so that it actually has a timer for the PING/PONG stuff, and if it doesn't get pinged after some time, it should assume the connection is dead..
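That watchdog idea can be sketched with System.Timeout: if nothing at all arrives within the window (server PINGs included), assume the connection is dead. A sketch only; the 5-minute window is an arbitrary choice:

    import System.Timeout (timeout)
    import System.IO (Handle, hGetLine)

    -- give up if the server goes completely silent for 5 minutes
    readWithWatchdog :: Handle -> IO (Maybe String)
    readWithWatchdog h = timeout (300 * 1000000) (hGetLine h)
    -- Nothing -> no traffic, not even a PING: treat connection as dead
    -- Just l  -> dispatch l; answer "PING ..." with the matching "PONG"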
02:55:51 <stepkut> sm: except, the tweets are done via javascript, so the site doesn't actually know anything about them :-/
02:55:55 <sm> not sure what else, actually, but there must be something. Aside from the usual announcing of news feeds (commits etc.)
02:56:01 <sm> figured that
02:56:05 <stepkut> sm: but it could look at the real twitter API and figure that out
02:56:15 <stepkut> commits would be nice to annouce
02:56:23 <stepkut> patch-tag.com needs an api for that
02:56:33 <sm> I do that with rss2irc bot
02:56:59 <sm> ACTION has found it tricky to write a really robust bot
02:57:27 <sm> it would be good to have more of that stuff in a library
02:57:49 <stepkut> sm: yeah
02:58:09 <stepkut> sm: there is a new irc bot library based on attoparsec and bytestrings that looks promising
02:58:13 <sm> hah I know.. the bot could give real-time traffic reports
02:58:32 <sm> especially when something unusual happens
02:58:34 <stepkut> sm: but it does not have a good way to do channel logging, because there is no way to hook into the messages that the bot itself sends
02:58:37 <gcollins> i would think the most interesting thing would be to introspect on the state of the server
02:58:52 <gcollins> so you could ask the bot to "give me the last 5 error log entries"
02:59:11 <gcollins> and yeah, "current req/s rate"
03:00:05 <sm> synthea: site users ? requests per second ? uptime ? memory usage ?
03:00:26 <gcollins> that's pretty cool actually
03:00:56 <sm> yup, I don't know another integrated site/bot yet
03:01:47 <stepkut> right now our request rate is about 34 visitors/day I think :p
03:01:50 <sm> slightly tangential, but I think having built-in knowledge of web stats is also unusual and really useful.. http://squeak.org/stats.html
03:01:56 <gcollins> :)
03:02:04 <sm> (like)
03:05:52 <sm> alas, looks like I gave that server a seizure
03:06:46 <sm> or one of these pages is slow. But also: http://squeak.org/admin.html?view=serverStatistics  , a persistent record of uptime
03:07:00 <sm> introspection, like gcollins said
03:08:17 <stepkut> yeah, first happstack.com has to have something worth introspecting about I think
03:12:30 <stepkut> I asked thomas to add a commit callback feature to patch-tag, so we can at least announce patches in the channel
03:12:52 <stepkut> on commit, patch-tag would POST some info to happstack.com. Then the bot could announce it..
03:19:28 <stepcut> ok, time for bed now!
03:39:21 <sm> good night
05:07:01 <dons> in the example minimal app on the happstack front page,
05:07:02 <dons>    "module Main where"
05:07:05 <dons> is unnecessary
06:20:21 <dancor> it appears that the happs withRequest request object doesn't correctly handle array inputs in rqInputs, and instead will only keep the first value of e.g. "a[]" instead of all of them.  is that true?
14:33:12 <stepcut> dons: but good style perhaps?
14:38:37 <stepcut> dancor: I think the Request object is fine, it is the lookInput functions that only find the first object
14:40:14 <stepcut> I have a new version of RqData that I will be adding in this release which addresses that issue, but you can just copy the functions you need for now
14:40:15 <stepcut> http://patch-tag.com/r/mae/happstack/snapshot/current/content/pretty/happstack-wai/Happstack/Server/RqData.hs
14:50:59 <stepcut> or ,you can just copy that whole file into your project
14:51:27 <stepcut> and import the RqData stuff from that file instead of SimpleHTTP
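The all-values lookup amounts to filtering rqInputs instead of stopping at the first match. Roughly, against the old happstack-server types (a sketch, not the actual RqData code; module layout varies across versions):

    import Happstack.Server.SimpleHTTP (Request(..), Input)

    -- return *all* inputs named 'name' (e.g. every "a[]" value),
    -- where lookInput returns only the first
    lookInputs :: String -> Request -> [Input]
    lookInputs name rq = [ i | (n, i) <- rqInputs rq, n == name ]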
17:23:18 <dancor> stepcut: cool
17:24:51 <stepkut> :)
17:52:55 <Gracenotes> hm, does MACID do any swapping on its own, by the way?
17:53:23 <mightybyte> Not that I know of
17:53:54 <stepkut> Gracenotes: no
17:54:56 <stepkut> Gracenotes: that would lead to very unpredictable access times
17:55:23 <Gracenotes> hm. so OS swapping is a bit more predictable
17:55:35 <stepkut> Gracenotes: no, OS swapping is even worse :)
17:55:55 <stepkut> Gracenotes: I have an idea for how to make a structure like IxSet that only stores the keys in memory, and the values on disk, and the values are returned via iteratees
17:56:38 <Gracenotes> oh, that sounds reasonably scalable :)
17:57:47 <stepkut> Gracenotes: perhaps :) storing things on disk at all is poor for scalability :)
17:58:14 <stepkut> Gracenotes: but there are some apps where it is ok
17:58:27 <Gracenotes> as a thought experiment, I'm thinking of ways to store a Wikipedia-size database
17:58:30 <stepkut> Gracenotes: perhaps you know that the data will be accessed infrequently.. or maybe the data is especially large
17:58:39 <stepkut> Gracenotes: how big is the wikipedia database ?
17:59:38 <Gracenotes> well, 389,934,961 total edits, each several kilobytes to dozens of megabytes worth of markup and metadata
17:59:54 <Gracenotes> and hundreds of thousands of log events
18:00:12 <stepkut> according to this page, it was 1.2TB in 2006, http://en.wikipedia.org/wiki/Wikipedia:Technical_FAQ#How_big_is_the_database.3F
18:02:15 <Gracenotes> 20,460,408 total pages, hm, and the job queue currently has 330,269 items. thankfully, most of it is cached, but the most expensive times are when the PHP backend has to be invoked (to render a page or run a job), and when the object isn't in memcached, to do a database read, and most expensive of all, a database write
18:02:41 <Gracenotes> I'm sort of surprised the system works at all :P
18:02:51 <stepkut> in march 2010, the text of the english wikipedia is 5TB
18:05:02 <stepkut> so, you *could* store all the text in RAM, but it would have to be split across around 100 servers (5TB / 100 ≈ 50GB of RAM per machine)
18:05:34 <stepkut> I think facebook's memcached farm is actually bigger than 5TB now.. they store something like 80% of their dataset in RAM
18:06:02 <stepkut> but for wikipedia, that is probably overkill, since most of the pages are infrequently accessed
18:06:43 <Gracenotes> oh, huh. does facebook use something like MPI to coordinate it all?
18:07:05 <Gracenotes> or just distributed memcached might work too..
18:08:18 <stepkut> for something like wikipedia, it seems like you could do something similar to what gitit does, and store most of the data on disk. You could even pre-render each page, store it on disk, and then serve it with sendfile maybe ?
18:08:34 <stepkut> your in memory database would be used for searching, etc ?
18:08:59 <Gracenotes> I think they use Lucene, which requires indexing
18:09:10 <Gracenotes> as an exhaustive search would be crazy
18:09:37 <Gracenotes> wikipedia *does* keep its parse trees in memory, not the actual rendered html, since logged-in users might have different display preferences
18:09:44 <stepkut> ah
18:09:53 <Gracenotes> and parsing the markup language is rather expensive
18:10:26 <Gracenotes> they also use squid for non-logged-in users, bypassing database memory altogether
18:10:34 <stepkut> yeah
18:11:13 <Gracenotes> I've read about distributed macid, though not the particulars... hm. 8.8
18:11:32 <stepkut> So, I think the answer is: they need a very customized solution which keeps some things in memory and other things on disk in various formats
18:12:46 <stepkut> one of the ideas behind macid is that they should be able to build that solution easily
18:13:12 <stepkut> instead of having macid *guess* what should be on disk/ram, you control that explicitly
18:13:52 <stepkut> macid has support for distributing the database across multiple servers, but it does not yet have the ability to partition the data
18:14:00 <stepkut> so every node has a complete copy of the database
18:14:07 <stepkut> partitioning is next
18:14:31 <stepkut> and then some additional types and tools to help you work with systems where you want to migrate some of the data to disk for storage instead of RAM
18:14:57 <Gracenotes> possibly on-the-fly?
18:15:11 <stepkut> sure
18:15:56 <stepkut> with the disk-based IxSet thing I'm imagining, you get back an iteratee (or something) that gives you access to the actual values
18:16:12 <stepkut> but, some of those values could already be cached in RAM
18:16:36 <stepkut> you would supply a caching policy to specify when/how to cache things on RAM or flush them to disk
18:16:58 <stepkut> but we would have some standard policies to get you started. Like, cache the N most recently used objects
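A very rough shape for that structure (all names hypothetical; nothing like this exists in happstack yet): keys live in an in-RAM index, values are loaded from disk on demand, and a policy such as "keep the N most recently used" would decide what stays cached. The iteratee-based streaming of result sets is elided here:

    import qualified Data.Map as Map

    -- hypothetical keys-in-RAM, values-on-disk structure
    data DiskSet k v = DiskSet
        { index :: Map.Map k FilePath   -- every key, with its disk location
        , cache :: Map.Map k v          -- the hot subset kept in RAM
        , load  :: FilePath -> IO v     -- deserialise one value from disk
        }

    lookupValue :: Ord k => DiskSet k v -> k -> IO (Maybe v)
    lookupValue s k = case Map.lookup k (cache s) of
        Just v  -> return (Just v)      -- cached: no disk touched
        Nothing -> maybe (return Nothing)
                         (fmap Just . load s)
                         (Map.lookup k (index s))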
18:17:48 <Gracenotes> mm. and then having to worry about all kinds of fun issues like thrashing, etc.
18:18:05 <Gracenotes> which one should worry about in these cases anyway, of course :)
18:18:18 <stepkut> yes, avoiding disk is best if you can
18:18:36 <stepkut> RAM is 100x faster these days, isn't it ?
18:19:38 <Gracenotes> yeah. that does sound doable... of course it's still hard to do stress-testing on the level of Wikipedia/Facebook/whatnot, let alone create a system that can take the stress
18:21:05 <stepkut> right. And the load on those systems changes too. So they have to keep readjusting things
18:21:17 <stepkut> that's why it seems good to have a system that is very 'adjustable'
18:23:12 <stepkut> the way facebook works these days is that they have only two-column tables in their databases, and they don't use transactions or joins. Instead they load all the data into memcached and do all the queries/joins in PHP. The php code is much slower at doing the queries (of course), but they couldn't scale the mysql pool beyond 800 servers, while they could keep adding more php servers no problem..
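The application-side join looks roughly like this (a Data.Map stands in for memcached; the key schema is made up):

    import qualified Data.Map as Map

    type Store = Map.Map String String   -- stand-in for memcached

    -- resolve a user's friend ids to names in application code:
    -- the "join" runs here, not in the database
    friendNames :: Store -> String -> [String]
    friendNames store user =
        [ name
        | fid <- maybe [] words (Map.lookup ("friends:" ++ user) store)
        , Just name <- [Map.lookup ("name:" ++ fid) store]
        ]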
18:23:39 <stepkut> they also have custom processes that run to migrate data out of mysql to even slower long term storage, to keep the mysql databases running smoothly
18:25:16 <stepkut> it's a mess :)
18:25:34 <stepkut> though, amazon is similar.. they also do not use transactions
18:26:45 <stepkut> here are two interesting readings
18:26:46 <stepkut> http://glinden.blogspot.com/2009/11/put-that-database-in-memory.html
18:26:52 <stepkut> http://highscalability.com/blog/2010/5/17/7-lessons-learned-while-building-reddit-to-270-million-page.html
18:27:26 <gcollins> joins kill, esp. in mysql
18:27:47 <stepkut> reddit also uses transactionless two-column tables and stores most everything in memcached I believe
18:28:29 <stepkut> seems better to just start with everything in RAM, and then migrate stuff to disk later if needed.. instead of trying to cache all your disk based stuff into ram
20:05:56 <bonobo> stepkut: everything in RAM is cool, but is IxSet the best way to do this?
20:06:13 <stepkut> bonobo: there is no requirement that you use IxSet
20:06:28 <bonobo> what else is there to use?
20:06:56 <stepkut> bonobo: anything you want, as long as it does not contain functions or existentials pretty much
20:08:20 <stepkut> that's one thing that makes MACID nice.. you can use just about any Haskell data structure
20:08:50 <bonobo> stepkut: migrations in IxSet cannot create new tables
20:08:57 <bonobo> if you know what I mean :)
20:09:08 <bonobo> that needs to be solved somehow
20:09:22 <stepkut> what is the problem ?
20:11:25 <bonobo> if one piece of data is embedded in another one, but in the next version it should end up in a separate component
20:12:51 <stepkut> ah
20:13:48 <stepkut> there are two issues here. One is splitting and ixset into two pieces, and another is migrating data between components
20:13:50 <stepkut> yes ?
21:08:18 <dcoutts> stepcut: btw, I've never quite grokked the way data components depend on each other
21:08:38 <dcoutts> is it simply so the top level proxy knows which components to activate
21:09:20 <dcoutts> or does it let us do more interesting things, like construct an atomic query/update that covers a data component and another data component it depends on
21:19:44 <stepcut> dcoutts: right now it is just a way so that the top level proxy knows which components to activate
21:20:28 <dcoutts> stepcut: ok
21:20:46 <dcoutts> stepcut: if that is all it is for, do you think it'd be possible to do using values, rather than using types
21:20:48 <dcoutts> it's hard to make lists of types at runtime
21:21:08 <dcoutts> in the hackage server we have a plugin-ish style, with a list of features
21:22:07 <dcoutts> each feature can have one or more data components
21:22:07 <dcoutts> while we can dynamically compose the ServerParts using msum, we cannot do anything equivalent for the data components
21:22:22 <dcoutts> a HackageFeature is simply a record, we start the server with a list of features
21:22:51 <stepcut> ah
21:23:50 <dcoutts> you see what I mean
21:24:07 <stepcut> yes
21:24:46 <dcoutts> we don't actually make it fully dynamic and pluggable, but the idea is to try and keep things well separated, to make adding or removing features much easier
21:25:26 <stepcut> yeah
21:25:38 <dcoutts> stepcut: of course the other way one might push it is to say that these dependencies could be used to let us do more
21:26:01 <dcoutts> like allowing one data component to build on another, and have atomic actions that span both components
21:26:34 <dcoutts> I can understand you might need a tight type restriction to make that possible
21:26:44 <stepcut> so, I am a bit fuzzy at the moment, but here is the underlying issue
21:27:17 <dcoutts> ACTION wonders if it's the unsafePerformIO global var hack
21:27:51 <stepcut> the components get serialized to disk. each component is stored as a pair, (the name of the type, the value for that type)
21:28:21 <dcoutts> you're talking about the checkpoints
21:28:25 <stepcut> yes
21:28:27 <dcoutts> rather than the event logs
21:28:56 <stepcut> yes
21:29:23 <stepcut> to deserialize the data, it needs a list of all the component type names, and a function which can read a value back into that type
21:30:59 <stepcut> (this is where I can't quite put the pieces together for you)
21:31:05 <dcoutts> mm
21:31:14 <stepcut> in order to read the data back, you need to know at compile time, what types to expect
21:31:20 <dcoutts> right
21:31:31 <stepcut> and that is where the Component stuff comes into play
21:31:58 <stepcut> I will think about this and give you a clearer answer tomorrow
21:32:18 <dcoutts> I suspect it could be done using existential wrappers
21:32:50 <dcoutts> so you have a typed data component (the type param being the type of the Haskell data)
21:33:06 <dcoutts> that would want to be in Typeable so you get the name
21:33:32 <dcoutts> but the deserialisation code does not need to know the actual type, so long as it can check the names match up and run a deserialisation function
21:34:58 <dcoutts> stepcut: don't pay too much attention, I'm just thinking out loud
21:35:25 <stepcut> so, when you have these plugins, you still have all the code for each plugin compiled in, you just don't activate the server parts for them ?
21:35:47 <dcoutts> right
21:35:51 <dcoutts> though that's not the real purpose
21:36:03 <dcoutts> the main purpose is that we can keep the features separated
21:36:13 <stepcut> yeah
21:36:23 <dcoutts> so that it's obvious where to insert a new feature
21:36:33 <dcoutts> and easy enough to extract or replace them
21:36:57 <dcoutts> without getting a big soup of entangled code for the different features
21:37:13 <dcoutts> the ServerPart MonadPlus is great for that
21:37:42 <dcoutts> something like data DataComponent a = Typeable a => DataComponent (ByteString -> IO a)
21:38:19 <dcoutts> wrap :: DataComponent a -> (Gubbins -> ByteString -> IO ())
21:38:43 <stepcut> my parents are in town, so I need to go eat dinner with them, but I definitely want to discuss this more
21:42:28 <dcoutts> point being, the code that actually receives the binary data and stashes the deserialised Haskell data into an IORef (or the happs equivalent bit that holds all the data components at runtime)
21:42:29 <dcoutts> that bit, does not need to have the 'a' type exposed
21:42:29 <dcoutts> so at that point, we can make lists of them
21:42:31 <dcoutts> [wrap componentA, wrap componentB] etc
21:42:31 <dcoutts> with better names than wrap :-)
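Filling in that sketch a little (everything here is hypothetical, better names than wrap still pending): a record pairing a type name with a decode-and-install action erases the component's type, so components can go in an ordinary list like [wrap decodeA installA, wrap decodeB installB]:

    {-# LANGUAGE ScopedTypeVariables #-}
    import Data.Typeable (Typeable, typeOf)
    import Data.ByteString (ByteString)

    -- the type 'a' is hidden inside the closures, so these values
    -- can be collected in a plain list at runtime
    data WrappedComponent = WrappedComponent
        { componentName :: String              -- matched against the checkpoint
        , restore       :: ByteString -> IO () -- decode bytes, install value
        }

    wrap :: forall a. Typeable a
         => (ByteString -> IO a)   -- deserialiser for the checkpoint data
         -> (a -> IO ())           -- stash the live value at runtime
         -> WrappedComponent
    wrap decode install = WrappedComponent
        { componentName = show (typeOf (undefined :: a))
        , restore       = \bytes -> decode bytes >>= install
        }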
21:52:34 <stepcut> ok I thought of two things while in the shower
21:54:10 <stepcut> 1. when we fix state so that there is a version of update / query which takes a macid handle instead of using the global IORef, that handle could have a phantom type representing the top-level Proxy type. Then when you use update / query, we could actually check at compile time if you are trying to use a component that is not actually loaded
21:54:17 <stepcut> which is something we don't do right now ;(
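That handle might look something like this (purely hypothetical, including the Holds class standing in for the compile-time "is this component loaded" evidence):

    {-# LANGUAGE EmptyDataDecls, MultiParamTypeClasses, FunctionalDependencies #-}

    data InternalState                   -- stand-in for the real handle guts

    -- 'st' is a phantom type naming the top-level Proxy's component set
    newtype MACIDHandle st = MACIDHandle InternalState

    class QueryEvent ev res | ev -> res  -- stand-in for happstack-state's class
    class Holds st ev                    -- "ev's component is loaded in st"

    -- querying a component outside 'st' becomes a type error, not a
    -- runtime failure
    query :: (QueryEvent ev res, Holds st ev) => MACIDHandle st -> ev -> IO res
    query = undefined                    -- sketch only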
21:55:28 <stepcut> 2. I do not think there is any fundamental reason why we can not load additional components on the fly. Right now we load them all automatically when the state system starts
21:55:55 <stepcut> in theory you could load them later.. though it would be up to the developer to ensure that they actually loaded the component before trying to do any queries against it
21:56:05 <stepcut> and that could be trouble when trying to replay the event logs..
22:01:50 <stepcut> I guess I am still not clear on why the Dependency type is a problem though. If you are already importing a plugin and adding its serverpart to a list of server parts, why is adding its component to a component list an issue?
22:02:01 <stepcut> gotta run, bbl.
22:13:04 <dcoutts> stepcut: the only problem is that we have to do it in two places
22:13:18 <dcoutts> one in the list of features, and again in a list of state components
22:13:46 <dcoutts> it's kind of unsatisfactory that at the top level we have to know which features do and do not have state components
22:13:51 <dcoutts> it breaks abstraction
22:14:37 <dcoutts> it would be more pleasant if given the list of features, we could just extract the state components as values and pass them to the transaction system
22:15:39 <dcoutts> stepcut: I certainly approve of update / query taking the state component handle as a parameter, rather than relying on an evil global mutable var
22:17:23 <dcoutts> as you say it guarantees that you have the state component loaded
22:17:34 <dcoutts> since the only way to get the handle is to load it
22:37:59 <bonobo> nice talk here
22:38:31 <bonobo> msum approach seems complementary to type level lists