10:03:09 <cheater> hi guys
10:04:01 <cheater> i am considering using happstack as a server for a blog server / community, and was wondering how best to go about this
10:06:12 <cheater> there are some things that are not clear to me: how can i archive out things from the persistence layer? for example i wouldn't want to keep older / less frequently accessed blog posts in my ram if i'm running out of memory
10:07:04 <cheater> and, are the entries in the persistence layer mutable? can i use it for a user/pass list for example?
10:42:03 <cheater> hi rdtsc, do you use happstack?
10:42:17 <rdtsc> cheater: hi, I do
10:42:34 <cheater> want to answer some questions from a new user? :-)
10:43:20 <rdtsc> cheater: I am a new user too, but I'll try =)
10:43:27 <cheater> :)
10:43:38 <cheater> i just posted some Q's before you joined in, can I paste them in msg?
10:44:02 <rdtsc> sure
10:45:10 <cheater> done
10:49:11 <rdtsc> cheater: All entries in macid are mutable, as in all State monads. As for the first question - I don't know
10:50:14 <cheater> gotcha
13:28:38 <cheater> how does happstack's persistence layer share across separate servers?
14:01:21 <rdtsc> http://tutorial.happstack.com/tutorial/multimaster
14:02:24 <burp> <Lemmih> The multimaster code is no more.
14:03:27 <cheater> multimaster code is no more? huh? :)
14:12:06 <cheater> rdtsc: nice link, thanks
14:12:25 <cheater> i am reading that tutorial but didn't get that far yet
14:19:16 <stepcut> cheater: the persistence layer is mutable
14:20:03 <cheater> cool stepcut
14:20:10 <cheater> what about putting it on the disk?
14:20:22 <cheater> and/or some central specialized storage?
14:20:44 <stepcut> cheater: there is currently no explicit support for archiving things to disk.
14:20:47 <cheater> is the physical behavior of the persistence layer changeable?
14:20:50 <cheater> hm
14:21:01 <cheater> that's not good, is it?
14:21:04 <stepcut> for centralized storage there is experimental support for amazon web services now
14:21:15 <stepcut> physical behavior?
14:21:39 <cheater> as in, how and where stuff is stored
14:22:19 <cheater> whether it gets stored in memory, on the local disk, on some central server on the disk, on some specialized server, ...
14:23:04 <stepcut> depends on what you mean by 'stored'. The data is stored in memory, but the events and checkpoints are also written to a log so you can recover from a crash/restart/etc.
14:23:41 <stepcut> the support for writing logs uses 'plugins' of sort that allow you to change how/where things are stored
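The in-RAM-state-plus-event-log model stepcut describes can be sketched in plain Haskell. All names here are illustrative, not the real happstack-state API (which uses Template Haskell and its own Update/Query machinery); the sketch only shows the idea that the live state is an ordinary in-memory value, every update is an event, and events are appended to a durable log so the state can be rebuilt after a crash:

```haskell
import qualified Data.Map as Map
import Data.Map (Map)
import Data.List (foldl')

-- Hypothetical state and event types for a tiny blog.
type PostId = Int
type Blog   = Map PostId String

data Event = AddPost PostId String | DeletePost PostId
  deriving (Show, Read)

-- Applying an event updates the in-memory state.
apply :: Blog -> Event -> Blog
apply blog (AddPost pid body) = Map.insert pid body blog
apply blog (DeletePost pid)   = Map.delete pid blog

-- Durability: append each event to a log file as it happens.
logEvent :: FilePath -> Event -> IO ()
logEvent path ev = appendFile path (show ev ++ "\n")

-- Recovery after a crash/restart: replay the logged events over the
-- last checkpointed state to rebuild the in-memory value.
recover :: Blog -> [Event] -> Blog
recover = foldl' apply
```

The "plugins" stepcut mentions would correspond to swapping out `logEvent` for a different backend without touching `apply`.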
14:23:48 <cheater> but there's only so much memory you have.
14:23:59 <stepcut> yes
14:24:06 <cheater> hence, not everything will fit.
14:24:28 <stepcut> only if *everything* is more than the amount of RAM you have..
14:24:35 <cheater> it is.
14:24:53 <stepcut> most servers will take 32GB of RAM these days.. which is more than what a lot of people have for data sets :)
14:25:20 <stepcut> the long term plan is to support sharding so that the dataset can be split across multiple servers
14:25:21 <cheater> why on earth would i store an item in ram that is only accessed one time per month?
14:26:08 <cheater> i do not see this as an efficient way of using hardware or money
14:26:20 <cheater> i'm sure you know what i mean :)
14:26:28 <stepcut> cheater: because it is perhaps cheaper to buy extra RAM than to manage the complexity of figuring out when an item is going to be accessed during the month and making sure it can be readily accessed?
14:26:42 <cheater> it's very simple for me
14:26:48 <stepcut> cheater: well, keep in mind that over 80% of facebook's working data set is stored in RAM :)
14:27:42 <cheater> facebook is not very well known for good decisions
14:27:58 <stepcut> oh?
14:28:08 <stepcut> they seem to be doing ok..
14:28:53 <stepcut> anyway, you can store things on disk, there is just no mechanism to do it automatically
14:29:12 <stepcut> http://glinden.blogspot.com/2009/11/put-that-database-in-memory.html
14:29:12 <cheater> is there a way to hook into this somewhere so that i could do it in a clean manner?
14:30:13 <stepcut> I think the first step is to become clear on what 'it' is
14:30:25 <stepcut> how do you decide when to store things on disk?
14:31:15 <cheater> to simplify, when there are 11 or more posts on someone's page, only the first 10 are shown, and the rest is not, and probably will not be, so we put that stuff on disk.
14:31:59 <cheater> 'They go on to argue that a system designed around in-memory storage with disk just used for archival purposes would be much simpler, more efficient, and faster.' <<< they're just reiterating what i said :-)
14:32:16 <stepcut> yes
14:32:36 <stepcut> but I think that archival purposes is very application specific
14:32:52 <rdtsc> cheater: use rdbms
14:32:53 <stepcut> in your case, you want to archive all but the 10 most recent blog posts to disk, right?
14:33:37 <cheater> yes
14:33:49 <cheater> but still access the ones on the disk the same way as the ones in memory - just slower
14:35:38 <stepcut> so, let's say you want to search for all the posts that occurred in 2010.. that means you still have to have some of the information from those posts stored in RAM, right?
14:36:26 <cheater> i won't want to do that
14:36:43 <stepcut> oh ?
14:36:49 <cheater> yes
14:37:11 <cheater> if i do, i will implement it, but i do not see this as a requirement in the foreseeable future
14:38:11 <stepcut> i see
14:38:36 <cheater> hell.. i don't think facebook allows that, either.
14:38:51 <rdtsc> how can macid handle such things http://www.big-boards.com/  ? =)
14:41:10 <stepcut> cheater: perhaps you should just immediately save the posts to disk. Your state will just be a big list of the paths to all the posts. That is all that will be stored in the persistence layer.  If you want the 10 most recent posts cached in RAM, it would be trivial to implement that with a simple IO thread and a cache. But that is just a cache of things that are on disk.. it does not need to be stored in the persistence layer
14:41:34 <stepcut> in other words, treat the blog posts in much the same way that you would image files
14:41:51 <cheater> stepcut: that's not the right way to do this.
14:41:56 <stepcut> cheater: why not?
14:42:05 <cheater> i can store a LOT of blog posts in ram
14:42:07 <cheater> but, not all of them
14:42:21 <stepcut> cheater: so, make your cache as big as you want?
14:42:46 <cheater> stepcut: i do not understand what you mean there
14:42:47 <stepcut> cheater: make it hold 1000 posts if you want.. and when the cache gets full it ejects the least recently accessed post
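The cache stepcut sketches ("make it hold 1000 posts.. when the cache gets full it ejects the least recently accessed post") is a straightforward LRU cache. A minimal pure-Haskell version, illustrative only and not a happstack API, keeps a last-access counter per entry and evicts the oldest one on overflow:

```haskell
import qualified Data.Map as Map
import Data.Map (Map)
import Data.List (minimumBy)
import Data.Ord (comparing)

-- A minimal least-recently-used cache sketch (hypothetical names).
data LRU k v = LRU
  { capacity :: Int
  , tick     :: Int               -- monotonically increasing clock
  , entries  :: Map k (Int, v)    -- key -> (last-access tick, value)
  }

emptyLRU :: Int -> LRU k v
emptyLRU cap = LRU cap 0 Map.empty

-- A hit refreshes the entry's last-access tick.
lookupLRU :: Ord k => k -> LRU k v -> (Maybe v, LRU k v)
lookupLRU k c = case Map.lookup k (entries c) of
  Nothing     -> (Nothing, c)
  Just (_, v) -> ( Just v
                 , c { tick    = tick c + 1
                     , entries = Map.insert k (tick c, v) (entries c) } )

-- Inserting past capacity evicts the entry with the oldest tick.
insertLRU :: Ord k => k -> v -> LRU k v -> LRU k v
insertLRU k v c =
  let es     = Map.insert k (tick c, v) (entries c)
      victim = fst (minimumBy (comparing (fst . snd)) (Map.toList es))
      es'    | Map.size es > capacity c = Map.delete victim es
             | otherwise                = es
  in c { tick = tick c + 1, entries = es' }
```

In the setup stepcut describes, the values would be blog-post bodies loaded from disk by an IO thread on a cache miss.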
14:42:56 <cheater> stepcut: no, that's a backwards approach
14:43:01 <cheater> it should be ram-> disk
14:43:09 <cheater> not ram->disk->ram
14:43:39 <cheater> there's no point in putting data through the hdd before it becomes accessible
14:45:03 <stepcut> so you want an indexable set that is stored in RAM, but some of the elements of that set can also be stored on disk?
14:46:45 <cheater> yes.
14:47:12 <stepcut> and what happens when someone does, elem `isElem` mySet
14:47:42 <stepcut> is it going to have to temporarily load each saved item from disk to see if the elements are equal?
14:50:17 <cheater> i am not sure why i would like to do this
14:50:25 <cheater> maybe i don't want this sort of functionality
14:50:28 <stepcut> the system you seem to be describing, ram -> disk -> ram, seems very much like the way RDBMS systems work
14:50:50 <cheater> i have described ram->disk
14:51:03 <stepcut> sorry
14:51:34 <stepcut> I am still unclear how your approach would actually work
14:51:34 <cheater> or rather, ram->ram->disk where the data looks like this: ram(inside our app, not accessible generally) -> ram (accessible in the store) -> disk (accessible in the store)
14:52:02 <cheater> whereas you have described ram (inside our app, not accessible generally) -> disk (accessible in the store) -> ram (accessible in the store)
14:52:34 <cheater> the problem is that before the item i need high performance on, i.e. the first blog post, is accessible in the ram, it has to go into the hdd, and out of it, and that's a huge overhead
14:53:14 <cheater> and most likely, before the ram cache propagates, thousands/millions/billions of views will hammer the disk first
14:53:50 <stepcut> well, if you want durability, the data has to be logged to the disk first anyway, right ?
14:56:49 <cheater> no, durability can be ensured otherwise
14:57:35 <cheater> durability has not got much to do with the medium data is on
14:58:12 <stepcut> so, in your application code, when you write the code that retrieves and displays a blog post, are you going to have to distinguish between get a post that is in RAM, and one that is archived to disk?
14:58:27 <Muad_Dibber_> maybe I'm completely wrong and off topic here as a real happstack newbie, but the point here seems to be that you let macid do the work with sessions and such for you, and manually implement storing and loading blog posts?
14:58:34 <cheater> stepcut: no, i would like happstack to do it
14:58:51 <stepcut> cheater: and how will happstack know what data it ought to archive to disk?
15:00:19 <cheater> stepcut: via some code i will write that will provide a metric that explicitly decides one option or the other. the action of putting stuff on the disk should be code i write but 'in happstack'. the retrieval mechanism could also be code i write, but it will be 'in happstack'.
15:00:45 <cheater> i.e. if the metric is provided, but the disk io code is not there, it will be all in memory and it will work just normally
15:00:54 <stepcut> well, being 'in happstack' doesn't mean much
15:01:01 <cheater> whereas if the disk io is provided and the metric is not, then the same thing will happen
15:01:34 <cheater> stepcut: it means a lot, because 1. then this sort of approach will be reusable for other people who want to do it 2. i will be able to switch to a different implementation because my application is separated from the way happstack works 'internally'
15:02:08 <cheater> i.e. maybe in the future i would like to store the data in a NAS rather than on local disk, or something equally weird, who knows
15:02:39 <cheater> i wouldn't like my architecture options to be limited by a badly written application, so i would like to make this invisible behind the happstack api
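The pluggable archiver cheater is asking for can be sketched as a record of functions. This is a hypothetical interface, not anything happstack provides: the application supplies a "metric" deciding what to push out of RAM, plus the IO for whatever backend it chooses (local disk, a NAS, S3, ...), and the store calls them without the application caring which backend is in use:

```haskell
-- Hypothetical plugin interface (illustrative names throughout).
data Archiver k v = Archiver
  { shouldArchive :: v -> Bool           -- the metric
  , writeArchive  :: k -> v -> IO ()     -- push a value out of RAM
  , readArchive   :: k -> IO (Maybe v)   -- pull it back on demand
  }

-- "if the metric is provided, but the disk io code is not there, it
-- will be all in memory": a no-op backend keeps everything in RAM.
inMemoryOnly :: Archiver k v
inMemoryOnly = Archiver
  { shouldArchive = const False
  , writeArchive  = \_ _ -> return ()
  , readArchive   = \_   -> return Nothing
  }
```

Swapping backends later (e.g. the NAS cheater mentions) would then mean supplying a different `Archiver` value, leaving application code untouched.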
15:06:38 <stepcut> well, the happstack-state just stores a single haskell value. Not even a list of haskell values, or a table, or anything. Just a single haskell value. Fortunately, that value can be any type which it is possible to serialize (so, it can't be a type that contains functions, etc).
15:07:54 <stepcut> so, the happstack-state mechanism itself can not really know how to arbitrarily serialize some parts of some arbitrary value to disk. But, you can use the state mechanism in combination with a specialized type and some helper functions to store some data on disk and some in RAM
15:08:21 <stepcut> so it is certainly quite possible to build a blog post storage mechanism that behaves the way you want
15:08:37 <stepcut> and it could be implemented in a fairly general way so that you could use it for other things as well
15:10:26 <stepcut> for something like a blog post, there is often a bunch of meta data associated with it.. when was it posted, who is the poster, how many times has it been viewed, etc. One thing to decide is if you want to keep that data in RAM, so that you can quickly and easily search it, etc.
15:12:18 <stepcut> if you look at the scaling problems that companies like facebook, amazon, etc, have had, it seems like a lot of their difficulties have come from not really knowing whether the things they want to work with are really in RAM or not.
15:12:23 <cheater> you can do granular archival control by splitting into separate lists: one list for the meta data, one for the content.
15:12:38 <cheater> the content list would go to disk fairly quickly, the meta data one could happen much later.
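The split being discussed (metadata in RAM for searching, the body archived separately) can be captured with a "specialized type" of the kind stepcut mentions. Field and constructor names here are illustrative, not a happstack API:

```haskell
-- The body is either still in RAM or has been archived to a file.
data BodyRef = InRam String | OnDisk FilePath
  deriving (Show, Read)

-- Searchable metadata stays in the single in-RAM state value; only
-- the body reference may point at the disk.
data PostMeta = PostMeta
  { postId   :: Int
  , author   :: String
  , postedAt :: Int        -- e.g. a Unix timestamp
  , views    :: Int
  , body     :: BodyRef
  } deriving (Show, Read)

-- Reading a post is uniform; only the archived case touches the disk.
fetchBody :: BodyRef -> IO String
fetchBody (InRam s)     = return s
fetchBody (OnDisk path) = readFile path
```

Archiving a post then just means writing the body out and replacing `InRam s` with `OnDisk path` in the state, which is exactly the "metadata now, content earlier" schedule cheater describes.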
15:13:23 <cheater> their problems come from not being able to scale RDBMS storage
15:13:34 <stepcut> my reading of, 'They go on to argue that a system designed around in-memory storage with disk just used for archival purposes would be much simpler, more efficient, and faster.', is that 'archival' data should be a distinct barrier. Not something that is transparently accessed...
15:13:52 <cheater> yes
15:13:56 <cheater> well
15:14:15 <cheater> yeah. for one thing, with modern web apps you can have lazy search results
15:14:40 <cheater> i.e. 'slow media' report search results later, and those get loaded in via ajax, or via pagination
15:14:54 <stepcut> sure
15:16:28 <stepcut> archiving things to the local machine is also, perhaps, not a very good idea
15:16:45 <stepcut> let's say your site is so popular you need 5 servers to host it
15:16:49 <cheater> i don't see why not but i don't see why yes
15:16:57 <cheater> so i am partial to agreeing
15:17:30 <stepcut> do you want the archived data on all the machines? Perhaps the archived data should only reside on one machine that acts as a web service provider
15:18:24 <stepcut> certainly much slower, but it is centralized, seldom accessed, and perhaps that is better than all 5 of the other servers having their own copies of the old archived data?
15:19:12 <stepcut> the servers could just have a local blog post cache
15:19:51 <stepcut> so if an old article suddenly got popular, it would automatically end up in the RAM cache on the main servers
15:21:07 <stepcut> if you use centralized logging for durability, then your front end servers would not have any data stored on them .. they may not even have disks at all
15:21:28 <stepcut> and if your site gets a ton of traffic you could just fire up some extra servers?
15:30:13 <cheater> i think i would go for sharding with borders shared across nodes
15:30:20 <cheater> and the shard would contain its own archive
15:30:32 <cheater> i can do the sharding in my application, that's fine with me
15:30:54 <stepcut> yes
15:31:56 <stepcut> everything is, ultimately, 'in your application'. Right now you would have to write your own sharding library; someday we will actually provide one :)
15:35:26 <cheater> i see happstack as external to my application
15:35:38 <cheater> i.e. if i want to change something in happstack, i probably want everyone to be able to use it
15:36:45 <stepcut> sure
15:36:49 <cheater> btw, stepcut: how does happstack work on windows? i would like to check it out on my home pc, i don't have a linux box that i could dedicate to it right now
15:38:40 <stepcut> cheater: windows support is considered essential. 0.4.1 was successfully built on windows, and no one has reported any windows bugs yet
15:38:52 <cheater> great
15:39:22 <stepcut> if you use cabal install you may need to install happstack-data with the -O0 flag if you get linker errors against syb-with-class
15:39:31 <stepcut> that is a cabal bug though
15:39:37 <cheater> happstack looks really interesting to me. i'm glad i decided to learn haskell. i'm still reading through the first tutorials, but it looks easy enough.
15:39:51 <cheater> ok
15:40:05 <stepcut> happstack is neat stuff
15:40:20 <cheater> =)
15:40:39 <stepcut> though pinning down what exactly is happstack and what is not can be a bit tricky :)
15:41:04 <cheater> so happstack has its own web server. do haskell/happstack contain some useful tools for my every day web needs, like say, templating, views, or whatnot?
15:41:18 <stepcut> for example, happstack-ixset is just a Set with multiple indexes. It is not really happstack specific in any way... it's just very useful in happstack applications
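The "Set with multiple indexes" idea behind happstack-ixset can be approximated with a pair of maps. This is not the real IxSet API, just a sketch of the concept: one insert keeps every index in sync, so the same posts are reachable by id and by author:

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- Hypothetical two-index "set" of posts.
data PostIx = PostIx
  { byId     :: Map Int String    -- post id -> author
  , byAuthor :: Map String [Int]  -- author  -> post ids
  }

emptyIx :: PostIx
emptyIx = PostIx Map.empty Map.empty

-- One insert updates both indexes together.
insertPost :: Int -> String -> PostIx -> PostIx
insertPost pid who ix = PostIx
  { byId     = Map.insert pid who (byId ix)
  , byAuthor = Map.insertWith (++) who [pid] (byAuthor ix)
  }
```

The real IxSet generalizes this so the set of indexes is declared once per type rather than maintained by hand.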
15:41:54 <stepcut> happstack itself does not contain any templating. But it integrates nicely with Text.XHtml, HStringTemplate, and HSP
15:42:29 <stepcut> I prefer HSP myself, but not everyone does.. in happstack/happstack/templates/project there is an example of using HSP
15:43:50 <cheater> i'm a bit anxious to get going with haskell
15:43:56 <stepcut> :)
15:43:58 <cheater> i'll need to keep reading the tutorials
15:44:13 <cheater> i'm reading 'learn you a haskell'
15:44:19 <cheater> but it seems humongous
15:44:22 <stepcut> I heard that one is good
15:44:28 <cheater> it's jolly
15:44:33 <cheater> which makes it a breeze to read
15:44:37 <stepcut> :)
15:44:56 <cheater> in comparison reading the mysql manual is the most terrible experience ever
15:45:10 <cheater> it's the most inaccessible, dry text i have ever read
15:46:08 <stepcut> :)
15:46:35 <cheater> stepcut, do you know some big projects that have used happstack?
15:46:43 <cheater> big as in, high concurrency, lots of visitors, etc
15:47:31 <stepcut> cheater: no
15:48:10 <cheater> ow
15:48:24 <cheater> it would be nice to be able to refer to success stories, see how people did things
15:48:50 <stepcut> yes it would :)
15:49:39 <stepcut> I think people are having more trouble with the 'getting lots of visitors' part than the handling them part ;)
15:50:20 <stepcut> even if you get 1000 unique page views per day.. that is hardly high concurrency
15:50:32 <cheater> yes
15:51:20 <stepcut> even 100,000 hits per day is barely more than 1 per second
15:52:28 <cheater> yes
15:53:53 <stepcut> maybe happstack.com should build a dating site
15:54:07 <stepcut> to demo the technology, and to generate some revenue to fund development ;)
15:54:58 <cheater> lol
16:30:11 <rdtsc> stepcut: if you want to demo the technology, first of all you should fix package in hackage =)
16:30:45 <burp> dating site?
16:30:47 <burp> better a porn site
16:31:11 <rdtsc> ☺
16:32:34 <stepcut> burp: a dating site would probably pay better
16:34:09 <stepcut> burp: every aspect of running a porn site sucks.. the legal aspects, the billing aspects, the fraud aspects, getting content, getting traffic.
16:35:00 <stepcut> and seems less interesting from a technology showcase.. mostly you just need raw bandwidth and sendfile()
16:36:32 <stepcut> with a dating site you get user generated content, you can avoid billing altogether, more opportunities to show off technology, integration with asterisk, xmpp, etc.
16:36:50 <stepcut> still have to get traffic and deal with spammers though
16:37:15 <cheater> what about user generated content on porn sites
16:37:17 <cheater> :-)
16:37:28 <stepcut> cheater: USC 2257a
16:37:50 <cheater> what's that now :-)
16:38:07 <cheater> ahh, record keeping
16:38:16 <cheater> well, nobody forces you to store this in america
16:38:19 <stepcut> cheater: you have to keep records on file that the FBI can investigate any time they want without prior notice for every photo on the site
16:38:34 <cheater> you can keep that in germany
16:38:42 <cheater> or denmark
16:38:46 <cheater> or where ever :-)
16:39:28 <cheater> stepcut, do you know anything about function-level programming?
16:40:29 <stepcut> cheater: the problem is you have to collect them.. which is difficult for user generated content
16:40:37 <stepcut> function-level programming?
16:40:41 <cheater> yeah
16:40:48 <stepcut> I am not sure what you mean by that..
16:41:02 <cheater> http://en.wikipedia.org/wiki/Function-level_programming
16:42:30 <stepcut> no
16:42:38 <stepcut> aside from what I just read..
16:45:38 <cheater> i'm wondering if this is similar to writing fully lambda lifted programs
16:48:14 <Muad_Dibber_> cheater, did you try reading http://book.realworldhaskell.org (saw you talking about tutorials earlier on)
16:50:45 <cheater> Muad_Dibber_: nope!
16:50:53 <cheater> Muad_Dibber_: i'll check it out, thanks a lot
16:55:11 <Muad_Dibber_> and probably I'm just old news again, but I noticed blog.happstack.com isn't working :-)
16:56:57 <stepcut> Muad_Dibber_: yeah, mae decided not to pay to renew the vanity name for the livejournal account associated with blog.happstack.com
16:57:09 <stepcut> but we didn't come up with a plan as to what to do instead
16:57:14 <stepcut> I guess I could remove the link for now :p
16:57:38 <Muad_Dibber_> yeah dead links like that look so abandoned :-P
16:59:25 <rdtsc> maybe you should build blog about happstack using happstack?
17:00:32 <rdtsc> or take existent one
17:00:45 <stepcut> wordpress rather
17:03:14 <happstackcombot> 02 Feb 17:01 - update link to happstack blog (Jeremy Shaw)
17:20:30 <Muad_Dibber_> Thanks for the link to the guestbook project stepcut, its quite clarifying :-)
17:21:29 <stepcut> no problem
17:23:00 <stepcut> ok, the link is fixed now :)
17:26:53 <Muad_Dibber_> nice
18:22:58 <cheater2> hi again
18:23:47 <mightybyte> hello