irclog2html for #uclibc on 20031229

04:21.40*** join/#uclibc sjhill (~sjhill@12-217-217-31.client.mchsi.com)
04:47.13*** join/#uclibc landley (~landley@cs6625195-177.austin.rr.com)
04:47.29landleySo I'm messing with bzip, and getting strange results.
04:47.59landleyMy code generates valid bzip files, but the sizes differ from the archive sizes produced by the original version.
04:48.16landleyUsually mine are bigger, but I have at least one test case where it's smaller.
04:48.53landleyThe reason for that is the huffman code generation's ordering of tied entries (ones with the same weight).
04:49.42landleyThe other really oddball one is I made my buffer size bigger (used the full 900k instead of 900k-18 bytes), and the end result was a _larger_ archive.
04:49.47landleyDarn butterfly effects...
04:50.12landleyI suppose this is what I get for using random data as my test cases. :)
04:55.12mjn3i haven't looked at huffman coding in a while.  but if the weights are tied, why should there be a difference in size at that stage?
04:56.15landleyThere isn't at that stage.  But the resulting files made from things done in jseward's order are smaller in every test case I have except one.
04:56.33landleyIf they were _uniformly_ smaller, I'd just switch over to his and assume he knew something I didn't.
04:56.42landleyIt's that one outlier that's got me scratching my head...
04:57.06landley(Both sets of code produce valid huffman trees, and both extract to give you your original data back just fine...)
04:57.15landleyI have a _working_ bzip implementation, I'm just not happy with it yet...
04:59.04mjn3even if you switch back to his buffer size (-18)
04:59.53landleyAh, you've looked at the code then. :)
05:00.07landleyThe buffer size is separate from the different huffman tables.
05:00.13mjn3no... just noticed your comment earlier
05:00.37landleyIn a separate test I removed the -18 to see what it would do, and the result was a 1% larger archive.
05:01.35landleyNot a clue why.  I need a better range of test cases... :)
06:26.02*** join/#uclibc landley (~landley@cs6625195-177.austin.rr.com)
06:26.14landleymjn3: You still awake?
06:26.18mjn3yep
06:27.05landleyI was wondering if it would be okay to shorten your hospice notice in the bzip code slightly.  (I have a suggested new wording, lemme cut and paste...)
06:27.09landley<PROTECTED>
06:27.09landley<PROTECTED>
06:27.09landley<PROTECTED>
06:27.09landley<PROTECTED>
06:27.31landleyConsidering that their address and phone number and such are on their website, you see...
06:27.41landley("No" is a perfectly acceptable answer here...)
06:28.48landleyIt's not a pressing issue, I just thought I'd ask...
06:30.04landleyHa!  My code now compresses the linux-2.6.0 tarball to be 250 bytes shorter than jseward's code!
06:30.11landley(And I have almost no idea why...)
06:30.30mjn3i suppose it would be okay to omit the address, phone number, etc
06:30.43mjn3as you said, it is available on the web
06:30.49landleyThanks.
06:31.01landleyI was thinking about it because I'm about to contribute the compression-side code.
06:31.22landley(Also, I just looked up the info because in lieu of sending you a christmas present I thought I'd just send the hospice a check.)
06:31.34landley(But I compared the info against the web page out of habit to make sure it was up to date...)
06:31.53mjn3that would be great.  thanks.  they really were wonderful
06:32.20landleyI know.  My mother died of cancer just over a year ago, and a hospice gave us a live-in nurse for the last couple months...
06:33.20landley(And all the pain medication she wanted, which admittedly wasn't enough.  She had chicken pox as a child.  As her immune system went downhill, it came back as shingles.  Just to add insult to injury, really...)
06:33.37landleyAh, christmas cheer.  Right, back to data compression code...
06:34.35mjn3i really should add a note in all the other applets i've written
06:34.44mjn3prior to the 1.0 release
06:35.47landleyI'd bounce an email off andersee.  Maybe he can get you a HOSPICE.TXT file or something like that in the root dir...
06:37.06mjn3well, i've been needing to put a dedication in uClibc.  i suppose i should talk to him about busybox
06:37.28landleyThey're BOTH kinda close to a 1.0 release, aren't they?
06:38.15landleyOne reason I'm thumping on my home-grown linux from scratch system again is to have an environment to test out uclibc and busybox in "as part of this balanced breakfast", as it were...
06:38.38landleyI'm focusing more on busybox.  When it breaks, I can fix that.  When uclibc breaks, I'm pretty much lost...
06:38.44mjn3busybox is.  i think there's more work to do for uClibc.  because of delays, i haven't added the gettext stuff yet.  plus i need to implement wcsftime and stabilize the internal locale api
06:39.05landleyI switch locale and other foreign language support off, myself.
06:39.19landleyNo, the stuff that gets me is all the patches needed to make c++ work...
06:39.54landleyLinux From Scratch 5.0 redid the entire stage one build process so that you can swap out the C library.  (They manually edit the paths for binutils and gcc and such).
06:40.07mjn3once i stabilize the locale api, i plan to add a uClibc-specific support directory to libstdc++
06:40.09landleyThe main reason I can see for them doing this is making it easier to cross-build uclibc...
06:40.15landleyCool.
06:41.09landleyI gave up on c++ almost 10 years ago myself (about the time templates went in).  But there are some c++ packages I'd like to use.  (Konqueror comes to mind...)
06:41.25mjn3on the plus side, the core stdio rewrite is just about done.  all my tests are passing.  just testing a new (hopefully correct) popen implementation now
06:41.53mjn3yeah, i stopped following c++ just prior to STL
06:41.58landleyCool.  (What did the stdio rewrite accomplish, again?  I'm not up to speed on that bit of the code...)
06:42.22landleyIt's a pity Java went down the rathole it did.  I liked the language, circa 1.1 or so.
06:42.35landleyThese days, I use C for systems programming and python for apps.
06:42.46mjn3fixed a couple of minor bugs.  improved performance... especially for non-threaded apps that use getc/putc
06:43.02landleyCool.
06:43.24landleyDid you hear that the 2.6 linux kernel now has a -tiny tree?  (Which boots and runs comfortably in 4 megs again?)
06:43.36landley(Something that I'm not sure 2.4 _ever_ did... :)
06:43.53mjn3simplified the core code a lot.  the formatted i/o (printf/scanf) stuff i haven't touched other than necessary changes.  but i plan to rewrite vfprintf early next year
06:44.03mjn3i saw that
06:44.27landleyI'm entirely in favor of simplifying. :)
06:44.40mjn3btw... i fixed a long-standing bug wrt pthreads yesterday
06:44.46landley(That's actually my main attraction to busybox and uclibc.  I can understand what they DO...)
06:45.12landleyI just reinstalled my laptop with Fedora Core last week, and only just reinstalled the 2.6 kernel, so right now I haven't got a thing on here currently using uclibc yet...
06:45.40landleyI'm going to get my distro to build itself using busybox, and then I'm going to shoehorn uclibc into it.
06:45.55landleySo you've got yourself a bit of a breather before I start flooding you with bug reports. :)
06:45.56mjn3because of the configurability, the core stdio codebase was a mess of preprocessor logic.  i isolated a lot of that stuff now with macros in one internal header file
06:46.05landleyCool.
06:46.23landleyThat's one of the big things I learned from reading the linux kernel.  Making #ifdefs go away.  (Or at the very least seem to.)
06:46.31mjn3but we'll probably get another bug release out prior to committing the new stdio core
06:47.08landleyDarn it this depth tracking code is ugly no matter how I look at it...
06:47.20landley(It's three extra lines, but an UGLY three extra lines...)
06:47.27landleyTime to step back and drink some tea, I believe...
06:47.27mjn3perl 5.8.2 is failing one sigaction test out of 25.  passing all the others though
06:47.36landleyCool.
06:48.05landley(Okay what IS the deal with perl?  Modern perl is three times the size of the previous perl...)
06:48.14landleyI've heard of miniperl, but haven't played with it...
06:48.39mjn3i don't use perl myself.  but lots of people do and i build it just for the self-tests
06:48.51mjn3well, plus some things need it to build
06:48.58mjn3like the linux test project
06:49.01landleyLots of things need it to build...
06:49.09landleyMost of them are still happy with the older perl, though.
06:49.15landleyHopefully, some of them will be happy with miniperl.
06:49.22landleyThis is a to-do list item to find out in my new distro. :)
06:49.36mjn3probably
06:49.40landley(I'm basically aiming it at my laptop, which means it needs a full desktop, web browser, email, etc.)
06:50.06landleyAnd my server system on my cable modem, which means it needs apache, smtp, dns, iptables, etc...
06:50.25landleyI've already done a complete LFS based server system.  It's the desktop side, plus uclibc and busybox that are new.
06:50.34landley(And all the procedure changes in LFS 5.0...)
06:52.10mjn3apparently there's been a lot of work on using uClibc with gentoo
06:53.28landleyI saw that.
06:53.34landleyAnd andersee got debian building with it.
06:53.54landleyPersonally, I have this goal of minimizing the number of FSF packages in use on my system.
06:54.06landleyIt started when I began reading their code.
06:54.30landleyThe main things I do NOT have replacements for are binutils and gcc.  (tcc just ain't there yet...)
06:55.43mjn3there's lcc.  but last time i looked it didn't support 64 bit long longs and i don't know how good an optimizer it has
06:55.55mjn3hard to get away from gcc though
06:58.20landleyYup.
06:58.41landleytcc at least has (as an explicit goal) being able to build the linux kernel.
06:58.49landleyBut it has NO optimizer to speak of...
06:59.05mjn3and only supports x86 i think
06:59.26landleyYup.
06:59.28mjn3we'll see.  i want to find out what breaks
06:59.39landleyIt's odd, both glibc and gcc have explicitly forked away from the FSF.
06:59.57landleyglibc is now maintained by Red Hat, and gcc is now maintained by the egcs committee.
07:00.17landleyThe FSF has largely proved itself incapable of maintaining projects past the initial coding stage.
07:00.56landleyYet somehow, the packages they claim credit for have this shared tendency to get bigger and bigger...
07:00.59landleyOh well.
07:01.01mjn3glibc is controlled by drepper.  he isn't high on my list
07:01.09landleyI know.
07:01.28landleyThere's a number of people's lists he isn't high on.
07:01.45landleyOh, speaking of which: does the new thread library work with uclibc?
07:01.51landley(Whichever one IBM surrendered to...)
07:02.06mjn3someone was working on porting it
07:02.38landleyDarn it.
07:02.38mjn3erik may have heard something
07:02.55landleyI've got a "this will almost never happen, and if it does we don't really care anyway" case...
07:03.09landleyAnd I'm slowing down a loop making sure it doesn't happen...
07:03.33landleyThis is the trouble with working from jseward's code.  There are assumptions baked into it that influence your thinking.
07:03.45landleyHalf the problem isn't what he's actually doing, it's "should we be doing this at all"?
07:03.49landleyHmmm...
07:04.16Lethalthe plan 9 compiler isn't bad, and is somewhat portable. gcc really isn't that bad though, once you run it through indent a few times..
07:04.29landleyBuffer size 900000, that's a little less than 2^20...
07:04.52landleySo a pathologically unbalanced huffman code table could at worst have a depth of about 20...
07:04.59landleyAnd I've got 8 bits reserved here.
07:05.11landleyTherefore, adding it together is NEVER going to cause it to overflow out of the depth and into the weight.
07:05.18landleyRight.
07:05.35landleyI need to spend more lines commenting that than I just saved, but I don't care!
07:06.10landleyScrew it, I'll write separate documentation on the algorithm.
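The packing argument above can be sketched in C. This is a minimal illustration, assuming the weight sits in the upper 24 bits of a 32-bit word and the depth in the low 8 bits. Only the 8-bit depth field comes from the log; the names `pack_node`, `node_weight`, `node_depth`, `merge_nodes`, and the 24-bit split are hypothetical and not taken from landley's or jseward's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: weight in the upper 24 bits, depth in the low 8. */
#define DEPTH_BITS 8
#define DEPTH_MASK ((1u << DEPTH_BITS) - 1)

/* Combine a weight and a depth into one word. */
static uint32_t pack_node(uint32_t weight, uint32_t depth)
{
    assert(depth <= DEPTH_MASK);                 /* depth must fit in 8 bits */
    assert(weight < (1u << (32 - DEPTH_BITS)));  /* weight must fit in 24 bits */
    return (weight << DEPTH_BITS) | depth;
}

static uint32_t node_weight(uint32_t node) { return node >> DEPTH_BITS; }
static uint32_t node_depth(uint32_t node)  { return node & DEPTH_MASK; }

/* Merge two subtrees: weights add, depth becomes 1 + the deeper child. */
static uint32_t merge_nodes(uint32_t a, uint32_t b)
{
    uint32_t da = node_depth(a), db = node_depth(b);
    uint32_t depth = 1 + (da > db ? da : db);
    return pack_node(node_weight(a) + node_weight(b), depth);
}
```

With a block of at most 900000 symbols, the summed weight stays below 2^20, and a depth anywhere near the log's estimate of 20 is far below the 255 that 8 bits can hold, so neither field can spill into the other under these assumptions.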
07:07.39landleyI haven't looked at the plan 9 compiler.
07:07.52landleyAnd I've tried to use gdb maybe a half-dozen times in my life.
07:07.58landleyEach one was a disaster of one sort or another...
07:08.14landley(Everybody wants to do integrated debugging with emacs.  I'm just not going there.)
07:08.25Lethalit doesn't get any worse than remote multithreaded debugging with gdb..
07:08.44mjn3that sounds pretty ugly
07:08.47landleyDarn it, is there a standard "max(x,y)" macro anywhere?
07:09.09landleyI've done remote multithreaded debugging under OS/2, but that was with an IBM debugger.
07:09.24landley(Don't remember if it was through the network or a serial cable.  Network, I think.  _TOKEN_RING_ network...)
07:10.20Lethalmjn3, that's not the half of it. gdbserver and gdb use two separate threaddbs that don't quite behave the same. the gdbserver side delays thread creation and slowly feeds it to the client, so you have to wait awhile for your threads to show up, sometimes they don't, etc. and the gdbserver side doesn't use the notification system that the host does..
07:12.15Lethallandley, include/linux/kernel.h has appropriate min/max implementations. that's probably as standard as you'll get.
07:13.02landleyThat's not getting included into busybox. :)
07:13.29mjn3oh... in busybox.  i don't know.  i thought you were talking about std c
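Standard C provides no integer max macro (C99 has `fmax` in `<math.h>`, but only for floating point), so projects usually define their own. A minimal sketch of two common approaches follows; the names `MAX` and `max_int` are illustrative and not taken from busybox.

```c
/* Simple macro. It evaluates each argument twice, so avoid arguments
 * with side effects such as MAX(i++, j). */
#ifndef MAX
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#endif

/* Type-checked alternative for ints that evaluates each argument once. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}
```

The macro works for any comparable type, while the inline function trades that flexibility for single evaluation and type checking.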
07:14.04landleyI got used to threads.  Programming in OS/2 and then switching to java was actually pretty good preparation for debugging SMP deadlocks in the linux kernel... :)
07:14.10landley(Or better yet, avoiding them...)
07:15.44landley13392 bytes for a bzip compressor executable.
07:16.18landleyOver half of that is the block sorting stuff I haven't touched yet.
07:17.06landleyWhat's the command to see how big each piece of something is?  (argument to nm, but I forget what...)
07:18.11landley[landley@driftwood bzip2]$ nm --size-sort -S bznew
07:18.11landley0804be44 00000001 b completed.1
07:18.11landley0804acc4 00000004 R _fp_hw
07:18.11landley0804acc8 00000004 R _IO_stdin_used
07:18.11landley080495af 00000017 T main
07:18.12landley0804bd00 00000038 d incs
07:18.14landley0804ac40 00000044 T __libc_csu_fini
07:18.14mjn3nm -S --size-sort -t d    will give sorted sizes in decimal
07:18.16landley080486b2 00000045 T flush_outbuf
07:18.18landley0804abf8 00000048 T __libc_csu_init
07:18.20landley0804a8bc 0000008a T BZ2_blockSort
07:18.22landley080486f7 000000c6 T put_bits
07:18.24landley0804a946 000000de t fallbackSimpleSort
07:18.26landley08048484 000000f9 T init_bzip_data
07:18.28landley08049480 0000012f T compressStream
07:18.30landley0804857d 00000135 T input_rle_data
07:18.32landley08049cb9 0000019c t mainSimpleSort
07:18.34landley0804aa24 000001d1 t mainGtU
07:18.36landley080495c8 00000358 t fallbackQSort3
07:18.38landley08049920 00000399 t fallbackSort
07:18.40landley08049e55 0000050a t mainQSort3
07:18.42landley0804a35f 0000055d t mainSort
07:18.44landley080487bd 00000cc3 t output_mtf_data
07:18.46landleyHeh heh heh. :)
07:18.48landley[landley@driftwood bzip2]$ nm -S --size-sort -t d bznew
07:18.50landley134528580 00000001 b completed.1
07:18.52landley134524100 00000004 R _fp_hw
07:18.54landley134524104 00000004 R _IO_stdin_used
07:18.56landley134518191 00000023 T main
07:18.58landley134528256 00000056 d incs
07:19.00landley134523968 00000068 T __libc_csu_fini
07:19.02landley134514354 00000069 T flush_outbuf
07:19.04landley134523896 00000072 T __libc_csu_init
07:19.06landley134523068 00000138 T BZ2_blockSort
07:19.08landley134514423 00000198 T put_bits
07:19.10landley134523206 00000222 t fallbackSimpleSort
07:19.12landley134513796 00000249 T init_bzip_data
07:19.14landley134517888 00000303 T compressStream
07:19.16landley134514045 00000309 T input_rle_data
07:19.18landley134519993 00000412 t mainSimpleSort
07:19.20landley134523428 00000465 t mainGtU
07:19.22landley134518216 00000856 t fallbackQSort3
07:19.24landley134519072 00000921 t fallbackSort
07:19.26landley134520405 00001290 t mainQSort3
07:19.28landley134521695 00001373 t mainSort
07:19.30landley134514621 00003267 t output_mtf_data
07:19.53landleyMy functions are init_bzip_data, input_rle_data, flush_outbuf, put_bits, output_mtf_data, compressStream, and main.
07:20.05landleyoutput_mtf_data is the meat of the program.
07:21.11mjn3cool
07:21.41landleyPosting to the busybox list in about 5 mins, want to run some more tests first...
07:21.43mjn3gcc final build failed without the fake glibc defines  :-(
07:22.13landleyMy lack of surprise is boundless, unlimited...  well, perhaps merely immense.
07:22.33landley"Incestuous" is the word that comes to mind when thinking about that code...
07:22.35mjn3indeed.  i'll look at it later
07:22.39mjn3very true
07:22.57landley"The gcc's connected to the... binutils, the binutils' connected to the... glibc..."
07:23.09landley(There's no ascii musical notation. :( )
07:23.56landleyWhat exactly ARE the fake glibc defines?
07:24.00landley(How many of them are there?)
07:24.17landleyI ask because it might be possible to pass them on the command line for the builds that need them.  -DSHUT_UP_FSF
07:24.34landleyIf it's only one or two packages...
07:25.47mjn3just 3.  __GNU_LIBRARY__, __GLIBC__, and __GLIBC_MINOR__
07:26.22landleyI'd try building gcc with those forced in either the configure or make step...
07:26.23mjn3probably need to get the configure stuff fixed before we can hope that gcc might behave
07:26.41landleyWhich configure stuff needs to be fixed?
07:27.18mjn3recognizing something like <arch>-*-<{uc}linux>-uclibc
07:27.47landleyAh.
07:27.51landleyWorthy goal.
07:28.09landleyI'm fiddling with ./configure stuff trying to get busybox to work as part of a development environment.
07:28.20landley(That's how I got into fixing sed in the first place.)
07:28.31landleyThat's about to come off the back burner...
07:29.09mjn3worthy goal... that other people seem to want to work on.  i've got too much lib stuff i want to work on right now to mess much with the toolchains
07:29.55landleyI know the feeling...
07:30.06landleyDarn it, I just broke the thing.  what did I do?
07:31.53mjn3and seeing that other people can work on the toolchain stuff, but i'm the only one likely to work on the l10n/i18n issues...
07:32.18mjn3which is really still a big mess, since it was meant as purely experimental code to feel my way around the issues involved
07:33.09landleyGood luck there.  I've never cared much for internationalization.  (I know it's important, but deep down I can't help but view the world's persistent habit of not speaking english as a major annoyance, on a purely pragmatic level.)
07:33.19landleyA personal failing, I know...
07:33.36landley(I've had to _DO_ internationalization.  Been paid to do it even.  But it's not something I work on for fun.)
07:35.22landleyWow, the amount of damage a missing ~ can do... :)
07:35.38mjn3really all of that work was funded.  the internal code really shouldn't take much more to get into shape
07:36.51landleyYeah, I generally find being paid to work on something one of the stronger motivations. :)
07:36.52mjn3the painful thing is going to be writing a real localedef utility so that we can remove the dependency on glibc.  right now, much of the locale data is being generated by running glibc-linked applications
07:37.15landleyHow much is involved in defining a locale?
07:37.30landleyDoes it work from some kind of source textfile or something?
07:37.36mjn3yes
07:37.46landley(I vaguely remember dealing with this a looooong time ago...)
07:37.57landleyCan we maybe just parse the source textfile directly?
07:38.07landleyDoes any system actually NEED 8 gazillion active locales at any given time?
07:38.42landley(I assume the system is using one, and that apps override it with some kind of library call if need be.  So the parsing overhead shouldn't be too crazy...)
07:38.56mjn3much of the data is shared.  not in glibc's implementation of course
07:39.11landleyShared between apps?  Between locales?
07:39.20mjn3between locales
07:39.39landleySo a locale could be defined as a delta vs another locale?
07:39.40mjn3much of my work was in identifying and removing redundant data
07:40.05landleyWhat kind of information is in a locale?  If it could be some kind of rcfile with keyword=value pairs...
07:40.32landleyThen the new ones could be parent=filename and some more keyword=value pairs to override stuff in the parent file...
07:40.53landleyOr words to that effect... :)
07:41.31mjn3the bulk of the data is for the collation tables
07:41.39landleyWhat's a collation table?
07:41.50mjn3of the current 250k or so, about 190k of that is for collation
07:41.58landleyFun.  What's a collation table?
07:42.12mjn3strcoll, strxfrm, wcscoll, wcsxfrm
07:42.22mjn3sorting of strings per locale/codeset
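As a rough illustration of what the collation tables feed, here is a minimal sketch that sorts strings with the standard `strcoll` under a chosen `LC_COLLATE` locale. The helper names `cmp_collate` and `sort_by_locale` are hypothetical; only `setlocale`, `strcoll`, and `qsort` are real library calls.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: order two C strings by the current LC_COLLATE rules. */
static int cmp_collate(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcoll(*sa, *sb);
}

/* Sort an array of strings in place using the named locale's collation.
 * Returns 0 on success, -1 if the locale is not available. */
static int sort_by_locale(const char **words, size_t n, const char *locale)
{
    assert(words != NULL || n == 0);
    if (setlocale(LC_COLLATE, locale) == NULL)
        return -1;
    qsort(words, n, sizeof *words, cmp_collate);
    return 0;
}
```

In the "C" locale `strcoll` orders strings the same way `strcmp` does, so uppercase letters sort before lowercase ones; other locales, if installed, may interleave them according to their collation tables.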
07:42.43landleyThis is going to involve unicode, isn't it?
07:42.49mjn3yes
07:43.30landleyI thought there was one big unicode table that had all the unicode information in it?  (I know sun lied to me about a lot of things, but this assumption has so far remained unchallenged...)
07:44.04landleyHow are strings even represented when passed to strcoll?
07:44.17mjn3for the ctype stuff, yes.  and there is a unicode collation algorithm.  but that is tunable for different cultural sorting conventions
07:44.41mjn3strcoll uses multibyte strings in the current locale's encoding
07:44.56mjn3wcscoll uses wchar_t strings
07:45.09landleyIs there a minilanguage for representing cultural unicode string sorting conventions?
07:45.45landleyBoth are translated to full sized unicode internally?
07:45.57landley(What's "full sized" these days?  32 bits, or still 16?)
07:46.04mjn3kind of... but it is still table-driven.  the trick is in identifying redundancies in the tables
07:46.10mjn332
07:46.30landleyIf they go to 64 bit unicode when we go to 64 bit processors, I'm going to point and laugh.
07:46.31mjn3although you can actually select with gcc.  but glibc and uClibc assume 32
07:46.34landleyI'm just warning you now...
07:46.52mjn3actually, unicode is fixed at a little over 20 or 21 bits
07:47.05landleyhow are the tables represented?  (Is there an example somewhere?)
07:47.06mjn3UCS-4 is 32 bits
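The "21 bits" figure follows from the highest code point Unicode defines, U+10FFFF. A small sketch that checks the arithmetic; the helper names `bits_needed` and `is_unicode_code_point` are illustrative.

```c
#include <stdint.h>

#define UNICODE_MAX 0x10FFFFu  /* highest code point Unicode defines */

/* Number of bits needed to represent value (0 needs 0 bits here). */
static unsigned bits_needed(uint32_t value)
{
    unsigned bits = 0;
    while (value) {
        bits++;
        value >>= 1;
    }
    return bits;
}

/* True if cp lies in the Unicode range, even though a 32-bit
 * storage unit has room for larger values. */
static int is_unicode_code_point(uint32_t cp)
{
    return cp <= UNICODE_MAX;
}
```

Calling `bits_needed(UNICODE_MAX)` gives 21, which is why a 32-bit wchar_t leaves the top 11 bits unused.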
07:47.36landleyOkay, strings are always sorted in some variant of "linearly", right?
07:47.48landleyBy which I mean get next symbol, compare, and have the ability to stop.
07:48.13landley(Taking into account that "get next symbol" could be going right to left...)
07:48.15mjn3look at /usr/share/i18n/locales/iso14651_t1 as an example
07:49.00mjn3strings are sorted in 4 passes where codes have weights specified at each level (pass)
07:49.25mjn3at some levels, sorting is backwards (right to left)
07:50.00mjn3it is also possible for a sort to depend on the number of ignored chars previous to the 2 being compared
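The level-by-level idea can be sketched with a toy comparison. This is a simplified illustration with only two levels and hand-written weights: level 0 compares base letters and level 1 breaks ties by case. It leaves out the backwards passes and ignored characters mentioned above, and the weights and function names are hypothetical, not taken from iso14651_t1 or uClibc.

```c
#include <stddef.h>

#define LEVELS 2

/* Hypothetical weights: level 0 is the base letter, level 1 breaks ties by case. */
static unsigned weight_of(unsigned char c, int level)
{
    int upper = (c >= 'A' && c <= 'Z');
    unsigned char base = upper ? (unsigned char)(c - 'A' + 'a') : c;
    if (level == 0)
        return base;       /* 'a' and 'A' share a primary weight */
    return upper ? 2 : 1;  /* lowercase sorts before uppercase on ties */
}

/* Compare level by level: a later level matters only if all earlier ones tie.
 * The string terminator gets weight 0, so a prefix sorts before a longer string. */
static int multilevel_compare(const char *a, const char *b)
{
    for (int level = 0; level < LEVELS; level++) {
        for (size_t i = 0; ; i++) {
            unsigned wa = a[i] ? weight_of((unsigned char)a[i], level) : 0;
            unsigned wb = b[i] ? weight_of((unsigned char)b[i], level) : 0;
            if (wa != wb)
                return wa < wb ? -1 : 1;
            if (!a[i])
                break;     /* both strings ended with equal weights at this level */
        }
    }
    return 0;
}
```

Under these toy weights "banana" sorts before "Cherry" (a plain byte comparison would put "Cherry" first), and "apple" sorts before "Apple" only because the second level breaks the tie.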
07:50.10landley# The comment at the beginning of this section mentions characters which
07:50.10landley# are not otherwise covered.  But this description cannot express this.
07:50.10landley# Therefore we add here a few entries which are used in older implementations
07:50.10landley# to be compatible.  --drepper
07:50.14landleyOuch.
07:50.32landleyWhat ARE ignored chars?
07:50.49mjn3that file is just the basic template for a number of locales.  they can include it and provide overrides
07:51.03landleyI take it that order in this file is significant?
07:51.15landley<hamza>
07:51.15landley<alef>
07:51.15landley<beh>
07:51.15landley<peh>
07:51.18landleyAnd all that?
07:51.21mjn3gives the weight
07:51.51landleySome say collating-symbol <4>
07:51.56landleyAnd some just say <4>
07:51.57mjn3it took me about a month and a half to come up with a way to compress all that
07:52.09landleyI'm still trying to figure out what it _means_ :)
07:52.14mjn3there's a specification pdf somewhere.  if you're interested, i can look it up
07:52.34landleyI must admit to a certain morbid curiosity...
07:52.41landleyBut the word "rathole" is coming to mind as well...
07:52.51landleyLemme clear my current to do list a bit first. :)
07:52.52mjn3as a comparison, glibc's locale archive for the locales we support in 250k or so is roughly 25-30MB
07:53.16landleyThe file I'm looking at is a glibc file.
07:53.51mjn3yes... but that's a source file.  localedef parses it and others to generate a locale file... /usr/share/locale
07:54.02landleyThat part I knew.
07:54.07mjn3recent glibc's collect all locales in a locale archive
07:54.19landleyIt creates an evil binary blob that I pretty much wanted nothing to do with last time I built a system...
07:54.33mjn3indeed
07:54.40landleya locale archive is not an improvement.  Bundling together bad ideas gives you one big worse idea.
07:55.11landley(And if you're going to archive stuff, use zip.  I think the zip format is underappreciated, myself.)
07:55.40landleyIt's pretty close to a compressed filesystem.  (The index isn't compressed, so you can seek to any file immediately and extract just that file...)
07:56.05mjn3the file gets mmap'd as i recall
07:56.13landleyMaking a tarballfs is virtually impossible, but making a zipfs (read-only) would be pretty straightforward...
07:56.14mjn3or at least the relevant parts of it
07:56.34landleyHow many locales do you EVER need to look at at once?
07:56.39landley(What, three?)
07:57.02mjn3anyway, the main goal for my stuff is to provide fairly comprehensive locale support in a small amount of space for something like a pda
07:57.23landleyWell, you're an order of magnitude ahead of glibc so far.
07:57.41landleyI gotta spend 5 minutes in another desktop sending my bzip implementation to the busybox list...
07:58.02mjn3if you can budget 300k (less if compressed in flash) for locale data, then you can market your product unchanged in a lot more markets
07:58.34landleyOne more test...
07:58.57landleyThat's a nice selling point, yes.
08:01.19mjn3another issue on my list to address is message object files for gettext.  for a given application, you wind up storing the keys plus the translations for each language.  that eats up a lot of space for redundant data, so i'm looking at something like a message object archive that stores the keys once for all the supported translations
08:03.05landleyHmmm...
08:04.18landleyIt could also be compressed pretty trivially.  (It's sounding like strerror).
08:04.25landleyWhat uclibc did with it, I mean...
08:05.34mjn3on something like a pda, typically that stuff will be compressed anyway in a jffs2 filesystem or somesuch in flash
08:05.45landleyTrue.
08:06.01landleyIn which case the duplicated keys are also compressed, although it's still a waste.
08:06.20landleyDepending on how the data is grouped, there might be a benefit there anyway.
08:06.32landleyPutting a lot of text together gives the compressor something to work with...
08:06.52landley(I dunno how gettext works.  The key isn't an offset into an array or anything, is it?)
08:07.01landleyNo preprocessor magic to be done here...?
08:07.30mjn3the key is the untranslated string.  then there's a hash table that indexes into a translation table
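A minimal sketch of that lookup scheme: the untranslated string is hashed, the hash picks a slot, and the slot points at a key/translation pair. The hash function, table size, and names (`struct msg`, `catalog_add`, `translate`) are illustrative assumptions; this is not the actual gettext message-object format or its hash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct msg {
    const char *key;          /* untranslated string */
    const char *translation;  /* string in the target language */
};

#define TABLE_SIZE 8  /* hypothetical, small for illustration */

static const struct msg *table[TABLE_SIZE];

/* Simple multiplicative string hash (an assumption, not gettext's). */
static unsigned long hash_str(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Insert with linear probing. Returns 0 on success, -1 if the table is full. */
static int catalog_add(const struct msg *m)
{
    assert(m != NULL && m->key != NULL);
    unsigned long slot = hash_str(m->key) % TABLE_SIZE;
    for (unsigned probes = 0; probes < TABLE_SIZE; probes++) {
        if (table[slot] == NULL) {
            table[slot] = m;
            return 0;
        }
        slot = (slot + 1) % TABLE_SIZE;
    }
    return -1;
}

/* Look up key at runtime; fall back to the key itself when no translation exists. */
static const char *translate(const char *key)
{
    unsigned long slot = hash_str(key) % TABLE_SIZE;
    for (unsigned probes = 0; probes < TABLE_SIZE; probes++) {
        const struct msg *m = table[slot];
        if (m == NULL)
            break;
        if (strcmp(m->key, key) == 0)
            return m->translation;
        slot = (slot + 1) % TABLE_SIZE;
    }
    return key;
}
```

The fallback to the original string mirrors how gettext behaves when a message has no translation. Because each language needs its own table with the same keys, storing the keys once for all languages is where the space saving discussed above would come from.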
08:07.56landleyI vaguely remembered that, I was just wondering if all the hashing and stuff had to be done at runtime...
08:08.08mjn3yes
08:08.22landleyPity.
08:08.44landleyDarn it, it made it bigger again!
08:09.00landley(Taking that -18 out makes the file bigger.  Every time.  I wonder why?)
08:10.19mjn3there's also catgets stuff, which doesn't have a runtime penalty.  but it is more intrusive.  most apps (for linux anyway) use gettext
08:11.30landleyI have some vague recollection of gettext being obsoleted in favor of gettext, which makes no sense.  Some kind of .(blah) syntax was going away in favor of something else...
08:11.39landleyThis was a couple years ago, though...
12:25.06*** join/#uclibc Qui_Gon (fox@81.185.48.139)
12:25.33*** join/#uclibc ambassador_ (~ambassado@h72.149.40.69.ip.alltel.net)
14:38.37*** join/#uclibc Qui_Gon (fox@83.21.185.81.internet9tcollecte.9massy1-1-ro-bas-2.9tel.net)
14:48.51*** join/#uclibc dsmith (~user@mail.actron.com)
17:20.57*** join/#uclibc randey (~randey@202.63.116.98)
17:59.35*** join/#uclibc DavidM (~david@h24-207-7-221.dlt.dccnet.com)
19:15.11*** join/#uclibc andersee (~andersee@codepoet.org)
23:32.19*** join/#uclibc TheMasterMind1 (~aman@h-66-167-234-81.MCLNVA23.dynamic.covad.net)
