--- Log opened Sat Aug 01 00:00:50 2009
03:36 < dons> Lemmih:
07:54 < Lemmih> dons: ping.
13:52 < Lemmih> dons: Around?
14:29 < mae> stepcut: data.text for text only? what about arbitrary binary data, like image/jpeg for instance
14:31 < stepcut> mae: um...
14:31 < stepcut> mae: i don't understand the question
14:49 < dons> Lemmih: ?
14:52 < Lemmih> dons: In what cases does 'replicateM n' blow the stack when 'getMany n' doesn't?
14:54 < dons> I think it was this program (in the testsuite)
14:54 < dons> import Data.Binary
14:54 < dons> main = do
14:54 < dons>     encodeFile "test.dat" ([1..10^7]::[Int])
14:54 < dons>     v <- decodeFile "test.dat" :: IO [Int]
14:54 < dons>     print (last v)
14:54 < dons>     print "done"
14:54 < dons> uses the heap instead
14:54 < dons> there was a mailing list thread.
14:55 < Lemmih> Why can't it be lazy? It should just be a stream instead of a list. That is, accessing the tail should force the head.
14:58 < mae> stepcut: in memory, i thought data.text was for utf-8? what about storing binary data like a jpeg in memory? are you saying that text would be more efficient in this case also?
15:01 < mae> s/utf8/readable text
15:06 < stepcut> I suggested that you ought to consider using Data.Text instead of String if you want an efficient data type for storing Unicode strings in memory...
15:11 < mae> ahh ok
15:11 < stepcut> oh, I also said that if you have other data stored in a space-inefficient data type, you could switch to a different, more space-efficient type
15:11 < mae> so bytestring is efficient in memory?
15:12 < stepcut> yeah
15:12 < stepcut> a bytestring that is 1000 bytes long probably takes up around 1012 bytes of memory or less
15:13 < stepcut> but, a bytestring is also very opaque...
15:14 < stepcut> you don't really know what is in the bytestring.. it could be a utf-8 encoded string, it could be a jpeg, etc.
15:14 < stepcut> and if it is a serialized Set, you won't be able to perform Set operations efficiently
15:14 < Lemmih> dons: on haskell-cafe@?
15:15 < stepcut> mae: but, you could design a CompactSet data-type which is more space efficient than Set by making certain trade-offs (you might have fast lookups, but expensive insert/delete/modify operations)
15:16 < mae> stepcut: so basically, use Data.Text and get the best of both worlds for text
15:16 < mae> but if its just arbitrary data then bytestring is fine (arbitrary like image/jpeg or an iso image for instance)
15:17 < stepcut> mae: sure.. though you might want to use newtypes so that you don't accidentally pass an ISO to something that expects a JPEG
15:17 < mae> yeah ok, but in http thats very flexible :)
15:17 < stepcut> mae: but in those examples, you are using the ByteString sanely -- because you are treating the jpg or ISO as a sequence of bytes
15:17 < mae> esp if we allow the app programmer to set the content-type headers
15:18 < stepcut> mae: but that is not sane for text, if you plan to do things like find the number of characters, etc.
15:18 < mae> so these newtypes would be for the app programmer
15:18 < mae> imo
15:18 < mae> ic
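[editor's note] A minimal sketch of the newtype idea stepcut mentions above, assuming strict ByteStrings. The names Jpeg, IsoImage and jpegSize are hypothetical and do not come from happstack or any other library; they only illustrate how the type checker can keep the two kinds of payload apart.

```haskell
import qualified Data.ByteString as BS

-- Hypothetical wrapper types (illustrative names, not a real library API).
-- Both wrap the same raw bytes but are distinct types to the compiler.
newtype Jpeg = Jpeg BS.ByteString
newtype IsoImage = IsoImage BS.ByteString

-- Accepts only a Jpeg; passing an IsoImage here is a compile-time type error.
jpegSize :: Jpeg -> Int
jpegSize (Jpeg bytes) = BS.length bytes
```

A newtype has no runtime cost, so the wrappers add safety without changing how the bytes are stored or sent.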
15:19 < stepcut> do people really want to put ISO images in their in-memory state?
15:19 < mae> it was just an example :)
15:19 < mae> i am thinking about whether bytestring being the sole way to send your response body is "correct"
15:19 < mae> in happstack-server
15:20 < stepcut> eventually it has to be cast to a bytestring, because that is what the network library uses, yes ?
15:20 < mae> no
15:20 < mae> the network library uses a handle :\
15:20 < mae> so it doesn't matter
15:20 < mae> anything that can write to a handle
15:21 < mae> but yeah, bytestring does make sense as being the "one true type"
15:21 < stepcut> ah, then I would say that forcing it to be a ByteString is less than ideal
15:21 < stepcut> one moment
15:21 < mae> well
15:21 < mae> i want to rewrite the core code to use sockets
15:21 < mae> instead of handles
15:22 < mae> so the behavior is more strict
15:22 < mae> but yeah, Network lets you do a send to a socket
15:22 < mae> Network.ByteString lets you do a send to a socket
15:22 < stepcut> if there was an hPutText :: Handle -> Text -> IO (), then that would be nice, since I will be generating the html pages as values of type Text
15:22 < mae> the former takes a String to send, the latter takes a ByteString
15:23 < mae> stepcut: what is the underlying implementation of text?
15:23 < mae> a bytestring?
15:23 < stepcut> no
15:23 < mae> custom?
15:23 < stepcut> Text is its own underlying implementation. packed Word16.
15:23 < mae> k
15:24 < mae> well Text doesn't really make sense to send to a raw handle or network socket
15:24 < mae> since the output encoding is not defined
15:24 < mae> so semantically it makes sense to convert it into a bytestring with whatever encoding you want
15:24 < mae> and then send to the raw handle or socket
15:24 < mae> because handles and sockets just take bytes, and have no concept of encoding
15:25 < stepcut> yes
15:25 < stepcut> as I said, at some level it eventually is converted into a stream of bytes
15:25 < mae> so can't we use the Codec module (can't remember the name) to lazily convert a Text to a ByteString in lets say, utf-8?
15:26 < stepcut> Text provides its own functions for converting to a utf-8 encoded ByteString (among other things)
15:27 < mae> yeah i noticed that
15:27 < mae> but that is inefficient in the sense that
15:27 < mae> we have to convert the whole darn thing before sending
15:27 < stepcut> Text provides both strict and lazy versions...
15:28 < stepcut> but a hPutText :: Handle -> Text -> IO (), might still be able to do better
15:28 < stepcut> and there is also the iterator option
15:29 < stepcut> e.g. hyena
15:29 < mae> but there is no output encoding defined
15:29 < mae> you need another argument
15:29 < mae> and if you do that
15:29 < stepcut> mae: oh right. hPutTextUTF8, etc.
15:29 < mae> you need to also do send
15:29 < mae> and recv
15:30 < mae> so probably it would be best to create a datatype encoding (or reuse something like this that already exists on hackage)
15:30 < mae> a datatype for encoding *
15:30 < stepcut> I thought you said the network library used a Handle?
15:30 < mae> well
15:30 < mae> the high level library
15:30 < mae> Network
15:30 < mae> gives you a Handle
15:30 < mae> but its merely a Handle that was converted from a Socket
15:30 < mae> so depending on whether you use Network.accept or Network.Socket.accept
15:31 < mae> you will get a Handle or a Socket
15:31 < mae> you lose a lot of fine-grained control with a Handle
15:31 < stepcut> sendTextUTF8 :: Socket -> Text -> IO () ?
15:31 < mae> yep
15:31 < mae> and then hPutUTF8
15:31 < mae> you also need recv and hGet
15:31 < stepcut> sure
15:31 < mae> (recv for Socket and hGet for Handle)
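[editor's note] A rough sketch of the hPutTextUTF8 idea discussed above, assuming the lazy Text type so that encoding happens chunk by chunk rather than converting the whole value before sending. encodeBody and hPutTextUTF8 are hypothetical helper names taken from the conversation, not functions exported by the text package; the library calls they rely on are Data.Text.Lazy.Encoding.encodeUtf8 and Data.ByteString.Lazy.hPut.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE
import System.IO (Handle)

-- Fix the output encoding explicitly: lazy Text in, lazy UTF-8 bytes out.
-- Chunks are encoded on demand as the resulting ByteString is consumed.
encodeBody :: TL.Text -> BL.ByteString
encodeBody = TLE.encodeUtf8

-- Hypothetical helper named after the suggestion in the chat:
-- encode to UTF-8 and write the bytes to the Handle.
hPutTextUTF8 :: Handle -> TL.Text -> IO ()
hPutTextUTF8 h = BL.hPut h . encodeBody
```

A Socket variant along the lines of sendTextUTF8 could presumably reuse encodeBody and hand the bytes to a ByteString-based send function from the network package instead of a Handle.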
15:35 < mae> so i've been talking to tibbe, and he thinks that a Handle is convenient but not typesafe for handling network stuff
15:35 < mae> like, some stuff only sends to a network socket, and not a Handle
15:35 < stepcut> yeah
15:35 < mae> sooo we don't like that the high level implementation forces you to use a Handle
15:37 < mae> stepcut: sendfile-0.5 is out
15:37 < mae> http://hackage.haskell.org/package/sendfile-0.5
15:37 < mae> good cpu efficiency, + >2gb files
15:39 < stepcut> mae: does it fix the bug with happstack + sendfile 0.3.1 where you get partial file downloads?
15:40 < mae> no happstack hasn't been touched yet
15:40 < mae> but yeah the problem was in the sendfile library, not with happstack
15:40 < mae> so it is fixed, but not in happstack yet
15:41 < mae> I will get to that when I can
15:43 < stepcut> ok
15:44 < stepcut> I just reverted to serveFileLazy now (or whatever it is called)
15:44 < mae> heh
15:44 < mae> so you're using the dev code eh?
15:44 < mae> brave
15:44 < mae> everything should work fine, but still, brave.
15:44 < stepcut> what could possibly go wrong
15:45 < mae> yeah, exactly, what could possibly go wrong?
15:45  * stepcut figures any version is brave ;)
15:45 < mae> well, any version that you haven't used for a long time that is
15:45 < mae> : )
15:45 < mae> i think i'll get back into hacking on the http core again
15:46  * stepcut hopes that means finishing sendfile support ;)
15:46 < mae> I am still unhappy that ghc is limited to 1024 open file descriptors, but oh well
15:46 < mae> I was thinking that you could use varnish to accelerate happstack
15:46 < stepcut> mae: isn't that an OS thing?
15:46 < mae> stepcut: yes and no, i mean, ghc uses select which is a well defined posix thing
15:46 < mae> so its portable
15:47 < mae> but every major os has its own version of what is called "event based" fd handling, which allows you to have as many fd's as you want (no arbitrary limit) allowing you to scale to many many concurrent connections
15:47 < mae> problem is, similar to sendfile, this feature is nonstandardized and different on each os
15:48 < mae> on linux we have epoll, on bsd we have kqueue, on windows we have "completion ports"
15:48 < mae> select only allows you to monitor 1024 file descriptors in high concurrency
15:48 < mae> (sockets are fds at the low level, so this limit is your open file handles + sockets combined basically)
15:49 < stepcut> mae: I am thinking that using varnish is anti-HAppS ;)
15:49 < mae> stepcut: yeah i know, but I want to spend some time in app building mode and not in "theoretical unscalability and ideal mode" for awhile hehe
15:50 < mae> besides, tibbe is working on a library to replace the io manager in ghc with these event based ones
15:50 < mae> i will try to help, but I want to get back into happstack-server
15:50 < mae> and theres only so much time :)
15:50 < mae> i'm hoping that alex & co will do some good work on state
15:52 < stepcut> mae: well, at least you have some apps with enough users that you feel a need for varnish ;)
15:55 < mae> stepcut: sure :)
15:56 < mae> i see it more as polishing happs since its still rough
15:56 < mae> : )
15:57 < mae> stepcut: we may have a lot of users, I guess, (maybe 100 active), but they aren't being converted into $$ yet
15:57 < mae> but i'm not giving up yet :)
15:57 < mae> thomas just got a new job i think
15:57 < mae> doing stuff with amazon ec2
15:58 < mae> from varnish website: We use Varnish in the 3rd meaning of the word. Who are we trying to kid? We are using it to make a bad backend (origin server) look good.
15:58 < mae> lol
15:59 < mae> ideally it would be neat if happstack-server was just so badass that you didn't need varnish
15:59 < mae> we will be there eventually :)
15:59 < mae> i like a small stack, KISS
16:00 < mae> the neat thing though is that varnish can be used as a load balancer too
16:00 < mae> so think in the future, multimaster
16:00 < mae> err
16:00 < mae> you want to scale it
16:00 < mae> your round robin A record for www.mysite.com is
16:00 < mae> and
16:00 < mae> these ips point to two varnish boxes
16:01 < mae> each varnish box has 2-3 happstack servers behind it (multimaster)
16:02 < mae> you can use a service like nettica to do managed dns, if one of these ips goes down, within a few minutes it will change the dns records so that the non-responsive server will be taken out of the A record
16:02 < mae> or something like that :)
16:02 < mae> not the most perfect fault-tolerance but not bad
16:03 < mae> if you want something truly badass, you can put your happstack multimaster nodes on ipv6 addresses, and take advantage of anycast
16:03 < mae> anycast, if you're not familiar with it
16:03 < mae> abstracts fault tolerance to the ip level
16:03 < mae> they use it for some of the root dns servers
16:03 < mae> so any of several nodes can respond to a request on a particular ip address
21:10 < mae_phone> ahoy
21:11 < mae_phone> I'm drinking beers at my brother in laws house, its his birthday.
21:15 < stepcut> mae_phone: I am drinking club soda and reading a book...
21:16 < mae_phone> club soda... And ???
21:17 < mae_phone> stepcut, you are building a webapp right? How's that going? Are you doing that full time?
22:05 < stepcut> club soda and a book
--- Log closed Sun Aug 02 00:00:52 2009