IRC log for #infobot on 20080315

14:23.08*** join/#infobot TimRiker (n=timr@70.1.153.92)
14:23.08*** mode/#infobot [+o TimRiker] by ChanServ
15:33.48CIA-26infobot: 03timriker * r1635 10/trunk/ (README infobot src/Factoids/Core.pl src/Modules/Units.pl): more bloot refs
15:34.56troubledTimRiker: at least it wasnt blootbot svn this time ;)
15:38.15TimRikerheh. yeah. my kid's at a chess tournament, so I'm hacking on the bot. planning to move over to infobot today.
15:47.02TimRikerwhat happened to textstats_main ?
15:53.26troubledTimRiker: textstats_main?
15:53.39troubleddoesnt ring any bells
15:57.01troubledbtw, there is still a 3600s timeout problem. havent been able to track it down yet though. i spent a bit looking, but its going to require a deep search to find out why.
15:57.33*** join/#infobot abot (n=timr@pdpc/supporter/active/TimRiker/bot/apt)
15:57.33*** mode/#infobot [+o abot] by ChanServ
15:57.45TimRikerabot: version
15:57.47troubledseems though that the ping/pong just stopped. could have been related to the scheduler changes I made, but I just cant see missing that one. tbh, I think it only worked before because the scheduler could be random
15:58.27TimRikerso does the bot disconnect now? or what happens?
15:58.33TimRikerabot: owner
15:58.38troubledso when I fixed it and it became 3600 exactly, it could have started. its hard to notice though since I think it only happens when it doesnt get any channel events
15:58.41TimRikerabot: are you alive?
15:59.02TimRikerabot: bzflist
15:59.04abots=1
15:59.08troubledya, just times out, and reconnects the alt nick. but the second one never times out
15:59.48TimRikerhmm.
15:59.48TimRikerwhat happens with multiple nicks then?
15:59.48troubledunless my perl libs got messed up somehow
16:00.07TimRikerinfobot: bzflist
16:00.15TimRikerabot: bzflist
16:00.17abots=1
16:00.22TimRikersomething is broken.
16:00.27troubledit doesnt try to recover back to the first nick. i havent tried using 2 or 3 simultaneous nicks though. didnt know you could til I ran into you in the channel that day
16:02.12troubledits odd though since it doesnt affect ravenbird. im suspecting I only noticed cause my bot's in idle channels, his isnt
16:02.59CIA-26infobot: 03timriker * r1636 10/trunk/files/sample/infobot.chan: +#infobot
16:03.23troubledcheater :p
16:03.29TimRiker:)
16:06.28troubledi recall spending an hour or so looking into it, but I think the ping/pongs were all fine
16:07.13troublederr, what I mean is that the code for the ping pong looked ok, but for some reason it just wasnt happening. so i get the feeling ravenbird is only staying online because of the channel spam
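The timeout bug being described here, a keepalive that only fails when the channel goes idle, is easier to see in a sketch. This is not the infobot source (the names `Keepalive`, `on_event`, and `check` are hypothetical); it's a minimal Python model of the logic under discussion: if last-activity is only refreshed by incoming traffic, an idle connection has to be probed with the bot's own PING before concluding it is dead.

```python
import time

class Keepalive:
    """Hypothetical sketch of the ping/pong watchdog being debugged.
    If the scheduler fires at exactly the timeout interval and nothing
    in an idle channel ever refreshes last_activity, the bot must send
    its own PING and track the PONG, or it silently drifts offline."""

    def __init__(self, timeout=3600, now=time.time):
        self.timeout = timeout
        self.now = now
        self.last_activity = now()
        self.awaiting_pong = False

    def on_event(self):
        # Any traffic from the server (including PONG) counts as activity.
        self.last_activity = self.now()
        self.awaiting_pong = False

    def check(self, send_ping, reconnect):
        idle = self.now() - self.last_activity
        if idle < self.timeout:
            return "ok"
        if not self.awaiting_pong:
            send_ping()             # give the server a chance to answer
            self.awaiting_pong = True
            return "pinged"
        reconnect()                 # PING went unanswered: connection is dead
        return "reconnected"
```

With a randomized scheduler the check rarely lined up with a fully idle hour, which may be why the bug only surfaced once the interval became exactly 3600s.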
16:07.15*** join/#infobot |bot (n=timr@pdpc/supporter/active/TimRiker/bot/apt)
16:07.16*** mode/#infobot [+o |bot] by ChanServ
16:07.34TimRiker|bot: foo
16:07.35|bot[foo] bar
16:07.38troubledim hanging around the keyboard though, so feel free to hilight me as needed
16:07.50TimRikerkk
16:08.04TimRiker~bzflist
16:08.06|bots=1
16:09.03*** join/#infobot |bot (n=timr@pdpc/supporter/active/TimRiker/bot/apt)
16:09.03*** mode/#infobot [+o |bot] by ChanServ
16:09.08TimRiker~bzflist
16:09.13|bots=263 PLAYERS(263) BZFS0026(233) bzflagr.net:5154(32) BZFS0058(30) bzflag.planet-mofo.com:4200(21) 06.bztank.net:5159(19) bzexcess.com:25155(15) bzexcess.com:25154(15) 01.bztank.net:5154(12) badgerking.fiedlers.us:5667(9) fairserve.bzflag.org:5220(8) bzflag2.norang.ca:4418(8) bzf.bzflag.org:5155(8) 81.171.22.238(7) dkdk.homelinux.com:5154(6) 05.bztank.net:5158(6) viper2.pimpi.org:5158(5) dub.bzflag.org:5150(5) bzflag2.norang.ca:4409(4)
16:09.39TimRikerit helps if I remove my broken proxy setting from the config.
16:19.16TimRiker|bot: chanset autojoin
16:19.57TimRiker|bot: chanset autojoin
16:19.58|botShowing autojoin values on all channels...
16:19.58|bot<PROTECTED>
16:19.58|bot<PROTECTED>
16:19.59|bot<PROTECTED>
16:19.59|botEnd of list.
16:20.16TimRiker|bot: chanstats
16:20.17|botI'm on 4 channels: #botpark/21, #debian-bots/14, #infobot/10
16:20.17|boti've cached 45 users, 36 unique users, distributed over 4 channels.
16:20.46TimRiker~version
16:20.46infobotit has been said that version is Kernel: uname -r.  Debian: cat /etc/debian_version. and: dpkg -l libc6 | tail -1.  Package: dpkg -l _pkgname_.  Miscellaneous: try --version on the command. check /etc/apt/sources.list; run apt-cache policy <packagename>, or lsb_release -d
16:21.38TimRiker~status
16:21.38infobotSince Sun Mar  9 03:58:25 2008, there have been 58 modifications, 748 questions, 0 dunnos, 0 morons and 456 commands.  I have been awake for 6d 12h 23m 13s this session, and currently reference 114513 factoids.  I'm using about 18824 kB of memory. With 0 active forks. Process time user/system 13316.16/535.33 child 0.02/0.07
16:21.39|botSince Sat Mar 15 16:08:48 2008, there have been 0 modifications, 0 questions, 0 dunnos, 0 morons and 1 command.  I have been awake for 12m 50s this session, and currently reference 1 factoids.  I'm using about 16660 kB of memory. With 0 active forks. Process time user/system 0.79/0.04 child 0/0
16:25.38TimRiker~convert 1 inch to microlightyears
16:25.40|bot1 inch cannot be converted to microlightyears:
16:25.54TimRikerwho broke units.pl and why?
16:27.41TimRikerah. converted to use the perl one. don't think that's a Good Thing actually.
16:27.51TimRiker~convert 1 inch to microlightyears
16:27.53|bot1 inch cannot be converted to microlightyears:
16:28.02TimRikernice brokenness.
16:29.02TimRikermy $response = readline ($units);    huh?
16:30.58troubledI think we went over the change to the package's unit conversion with dondel a while back
16:31.30TimRiker~convert 1 inch to microlightyears
16:31.31troubledno idea how readline worked its way in there
16:31.32|bot1 inch is approximately 2.68478e-12 microlightyears
16:31.44TimRikerah... it's NOT the perl units, its the command line "units"
16:31.50troubledya
16:32.03TimRikerand no useful message when it's broken it appears.
16:32.19troublednor mention of the dependency in the readme :)
16:32.34troubledleast i dont think i added one for it
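Since the conversion path here shells out to the command-line `units` binary (the `readline($units)` above is reading its output), the "nice brokenness" comes from swallowing the binary's exit status and stderr. A hedged sketch, in Python rather than the bot's Perl and with hypothetical helper names (the real Units.pl code is not shown in this log), of surfacing the tool's own error text; it assumes GNU units' `--terse` flag, which prints just the number:

```python
import shutil
import subprocess

def _run_units(query, target):
    """Invoke GNU units(1) in terse mode; returns (exit_code, stdout, stderr).
    Raising when the binary is absent is exactly the 'useful message when
    it's broken' that was missing."""
    if shutil.which("units") is None:
        raise RuntimeError("the 'units' binary is not installed")
    proc = subprocess.run(["units", "--terse", query, target],
                          capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr

def convert(amount, src, dst, run=_run_units):
    """Convert via units(1), surfacing its complaint instead of silence."""
    code, out, err = run("%s %s" % (amount, src), dst)
    if code != 0:
        # e.g. "Unknown unit 'sixpence'" -- the text the bot was eating
        raise ValueError((err or out).strip())
    return float(out.strip())
```

The `run` parameter exists only so the logic can be exercised without the binary installed.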
16:32.37TimRiker!WARN! ircCheck: ident(|bot) != param{ircNick}(abot,|bot).
16:33.02TimRikerthat's broken. though it might have been broken before too.
16:33.36troubledi seem to recall knowing about that. cant remember exactly what it was atm though
16:34.03troubledi think the irccheck code has to be moved into the scheduler.pl or something
16:34.17troubledsomething to do with variable scope iirc
16:34.45troubledits all a little hazy for me though, might be thinking of something else
16:38.06TimRiker~convert 12 pence to sixpence
16:38.08|bot12 pence cannot be converted to sixpence: Unknown unit 'sixpence'
16:38.22TimRiker~convert 12 inches to barleycorns
16:38.24|bot12 inches is approximately 36.0001 barleycorns
16:39.44TimRiker~convert 12 louis to livres
16:39.46|bot12 louis cannot be converted to livres: Unknown unit 'louis'
16:40.57TimRikerso generally we threw away he working full featured one and added a smaller limited version. why did we do that again?
16:41.11TimRikers/ he/the/
16:41.12|botTimRiker meant: so generally we threw awaythe working full featured one and added a smaller limited version. why did we do that again?
16:41.16troubledchecked my logs. havent seen that "ircCheck: ident" error you got. guess i never ran into it cause i never used it
16:41.56TimRikeryeah, it would only happen with multiple nicks.
16:42.02troubledi think it was to do with not having to maintain the conversions and remove the file from the src wasnt it?
16:42.06TimRikerjust not sure if it used to happen or not.
16:42.13troubledI can pull up the conversation if you like
16:42.19TimRikersure.
16:42.31troubledgive me a few minutes
16:42.57TimRikerwas there a lot of work in maintaining the conversions? /me does not recall that there was.
16:44.48troubledi thought it was because there were inaccuracies, but I might be thinking of currency conversions
16:46.54TimRikerdunno
16:47.29troubledwaiting to get a rough date of the conversation. takes a bit :)
16:48.43TimRiker~convert 1 `ls` to `cat`
16:48.45|bot1 `ls` cannot be converted to `cat`: Unknown unit '`ls`'
16:49.40TimRiker~convert 1 `ls>foo` to `cat`
16:49.42|bot1 `ls>foo` cannot be converted to `cat`: Unknown unit '`ls>foo`'
16:51.03troubled< dondelelcaro> troubled: NewUnits.pl calls units which uses the full /usr/share/misc/units.dat file; our Units.pl sucks in comparison
16:51.29troubledoct. 19th 2007 is the day we are after
16:52.23TimRikerhmm. k
16:52.57troubledi think it may have been affected by your changes of the bot and don's as well though
16:55.05TimRiker~convert 1 yotaainch to lightyears
16:55.07|bot1 yotaainch cannot be converted to lightyears: Unknown unit 'yotaainch'
16:55.52TimRiker~convert 1 yottainch to lightyears
16:55.54|bot1 yottainch is approximately 2.68478e+06 lightyears
16:56.16TimRiker~convert 1 yottameter to metres
16:56.18|bot1 yottameter is approximately 1e+24 metres
16:57.25TimRiker~convert 1 zettameter to metres
16:57.28|bot1 zettameter is approximately 1e+21 metres
17:00.15TimRiker~convert 1 zettainch to metres
17:00.16|bot1 zettainch is approximately 2.54e+19 metres
17:01.14TimRikerI think the new one is right actually. there was a debate back when yotta was added. unsure outcome.
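The bot's prefix answers above can be checked by hand from the standard SI prefixes, the exact inch (0.0254 m), and the IAU light year; these constants are general knowledge, not taken from the bot's data file:

```python
# Check the conversions from first principles.
INCH = 0.0254                     # metres, exact by definition
LIGHTYEAR = 9.4607304725808e15    # metres, IAU definition
YOTTA, ZETTA, MICRO = 1e24, 1e21, 1e-6

inch_in_microlightyears = INCH / (MICRO * LIGHTYEAR)   # ~2.68478e-12
yottainch_in_lightyears = YOTTA * INCH / LIGHTYEAR     # ~2.68478e+06
zettainch_in_metres     = ZETTA * INCH                 # 2.54e+19
```

All three agree with |bot's output, which supports the "new one is right" conclusion.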
17:02.13troubledwell, i got the conversation with don about it in front of me. just cheap arse sql program I used only exports as csv :/
17:04.46troublednor does it escape the delimiter....grrr
17:05.33TimRikerheh. s'ok. I'll go with the flow and keep the new one.
17:06.07TimRikerwish it would do us temp conversion
17:07.19troubledsorry, normally im in windows where Im already setup to access the logs. just put ubuntu on though yesterday so i havent had a chance to do much yet but config stuff
17:07.52CIA-26infobot: 03timriker * r1637 10/trunk/src/Modules/Units.pl: might as well strict
17:15.22troubledTimRiker: http://pastebin.ca/raw/943836  (you got 15 minutes)
17:15.37troubledlook around 6am - 8am when we talk about it
17:15.56troubledsorry for the delay, firefox was being braindead until i went through the proxy
17:17.37troubledit seems that mebi units were the big concern since #debian users practically live off of that conversion unit
17:18.52troubledseems I renamed his NewUnits.pm to Units and went with that
17:24.07TimRikerhttp://ibot.rikers.org/%23infobot/20071019.html.gz
17:24.53troubledi was going to mention the bot logs, but figured you would have already after I gave the date :)
17:25.17troubledI poked don in #-bots, but no idea if he's around atm
17:25.40troubledhes been afk for like 14 hours  though
17:29.49TimRikerk. I'm in.. just missed the conversation back when it happened.
17:30.40troubledi think we (you and I) talked about it as well at some point iirc, but probably +/- a day or two
17:37.42TimRikeryeah, I vaguely recall something. I'm just reviewing changes prior to moving my bots over.
17:38.18TimRikerwould be nice if tcp version shows the subversion rev.
17:39.51troubledthe bot version issue never was finalised and I didnt get a chance to actually decide on a naming/numbering scheme
17:40.46troubledcould just go with a 1.$rev or something for now and just put the $Id$ tag in the version file and alter the parsing a tad
17:41.37troubledmy problem with that was that the file way wouldnt work unless it was changed manually each rev
17:42.18TimRikerthere's an entry like that in debian bugs, but it's not filled in. q$Rev: 22 $   should get filled in, no?
17:43.40TimRikerwhat's the point in putting the version in a separate file? just as easy to touch the actual file, no? or is it that core.pl does not reload with "reload" ?
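On the `$Rev$` question: Subversion only expands the keyword in files that have the `svn:keywords` property set, and only to the revision at which that particular file last changed, which is why a rarely-touched version file goes stale. A small illustrative parser (the function name is hypothetical, not from the bot):

```python
import re

def rev_from_keyword(text, default="unknown"):
    """Pull the revision number out of an svn-expanded '$Rev: N $' keyword.
    An unexpanded '$Rev$' (no svn:keywords, or an svn export) yields the
    default, so callers can tell a stale version file from a live one."""
    m = re.search(r"\$Rev:\s*(\d+)\s*\$", text)
    return m.group(1) if m else default
```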
17:43.53TimRiker~reload
17:43.53infobotreloading...
17:43.53infobotreloaded:
17:43.54|botreloading...
17:43.54|botreloaded:
17:44.28TimRiker|bot: die
17:44.58TimRiker!WARN! PERL: DBD::SQLite2::st execute failed: columns nick, channel are not unique(1) at dbdimp.c line 419 at ./src/dbi.pl line 440.
17:46.10TimRikermight be trying to update seen once per connection.
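The DBD::SQLite2 error above is a unique index on (nick, channel) being hit by a plain INSERT. A stand-in demonstration in Python's sqlite3 (hypothetical column set, not the bot's real schema file), including the upsert that a once-per-connection seen update would want:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE seen (
    nick TEXT NOT NULL, channel TEXT NOT NULL, time INTEGER,
    UNIQUE (nick, channel))""")

con.execute("INSERT INTO seen VALUES ('TimRiker', '#infobot', 1)")
try:
    # a second sighting inserted instead of updated trips the unique index:
    # the modern equivalent of "columns nick, channel are not unique"
    con.execute("INSERT INTO seen VALUES ('TimRiker', '#infobot', 2)")
    failed = False
except sqlite3.IntegrityError:
    failed = True

# INSERT OR REPLACE updates the existing row instead of erroring out
con.execute("INSERT OR REPLACE INTO seen VALUES ('TimRiker', '#infobot', 2)")
```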
17:46.44troubledhmmm, i didnt put much into sqlite tests tbh. im a mysql man
17:47.19troubledsqlite is nice though, but between HEAD, your db and don's, we got too much legacy to preserve
17:53.33CIA-26infobot: 03timriker * r1638 10/trunk/src/ (IRC/Schedulers.pl Modules/Debian.pl modules.pl): ws
17:54.08troubledws?
17:55.05troubledTimRiker: did you update your schema yet to the current one?
17:55.32troubledmight want to csv dump it or something and reorganize a bit just in case
17:55.45*** join/#infobot Ttech (n=ttech@fullcirclemagazine/developer/ttech)
18:01.24TimRiker~ws
18:01.24infobothmm... ws is short for workstation.  White Space, or the country code for Western Samoa
18:01.33TimRikerwhite space in this case.
18:01.52TimRikertroubled: nope. what's needed?
18:02.51troublednot sure what your schema diff is, might want to dump schema and compare to HEAD sqlite schema though in case that was causing the problem
18:03.17troubledwhile I dont use it much, I do recall not having any problems with a fresh bot using sqlite
18:03.20TimRikerare we tracking "seen" in multiple channels now per nick?
18:03.38troublednot sure
18:03.42TimRiker~seen ravenbird
18:03.45infobotravenbird is currently on #botpark (1d 17h 51m 33s) #debian (1d 17h 51m 33s) #infobot (1d 17h 51m 33s) #debianppc (1d 17h 51m 33s) #debian-bots (1d 17h 51m 33s). Has said a total of 1 messages. Is idling for 1d 4h 5m 35s, last said: 'Welcome back xk, o lonely traveller amongst the TCP/IP packets of chalk.'.
18:03.46ravenbirdYou are moron #1, infobot
18:03.55troubledheh
18:04.02TimRikerhehehe
18:04.03troubledaparently
18:04.18TimRiker~seen infobot
18:04.19infobotinfobot is currently on #utos (1d 23m) #maemo (1d 23m) #fredlug (1d 23m) #opensimpad.org (1d 23m) #android (1d 23m) #asterisk-bugs (1d 23m) #udbug (1d 23m) # (1d 23m) #wowhead (1d 23m) #infobot (1d 23m) #lugot (1d 23m) #mseide (1d 23m) #wowwiki (1d 23m), last said: 'TimRiker meant: fixing it now to not msg if ...
18:04.25TimRiker^seen infobot
18:04.28ravenbirdTimRiker: i haven't seen 'infobot'
18:04.34TimRikeraw
18:05.21troublednot sure if he runs +seen in here or not
18:05.54troubledim assuming no from the reply, but who knows when it comes to bugs :)
18:17.42*** join/#infobot |bot (n=timr@pdpc/supporter/active/TimRiker/bot/apt)
18:17.42*** mode/#infobot [+o |bot] by ChanServ
18:38.16simonrvntroubled: hi
18:44.21troubledsimonrvn: did you have +seen in here?
18:46.10simonrvn*shrug*
18:53.13simonrvn^reload
18:53.13ravenbirdreloading...
18:53.14ravenbirdreloaded:
18:53.22simonrvn^die
18:56.42*** join/#infobot ravenbird (i=ravenbir@unaffiliated/simonrvn/bot/ravenbird)
19:18.17TimRikerwhy do we index seen as nick,channel ?
19:18.58TimRikershould be unique on nick from what I can see.
19:20.14TimRikerI'm also a non-fan of having 2 separate sets of sql create scripts. there are differences there that should not be there. In fact, I don't know of any differences that _should_ be there.
19:20.20simonrvnindeed. i think i edited my table to be that way
19:20.25TimRikerer make that 4 separate sets.
19:21.08TimRikeranyone recall who split them up and why?
19:22.31simonrvntroubled's in trouble :P
19:23.33troubledTimRiker: i split em due to differences in syntax
19:23.55troubledsqlite's syntax is limited compared to say pgsql
19:24.51troubledas for sqlite and sqlite2 schemas, i wasnt sure about them back then and was thinking ahead in case they needed to be different
19:25.07troubledsqlite and sqlite2's schemas can probably be merged to 1
19:25.16simonrvnsqlite3 has better real sql syntax than 2
19:25.53troubledi know mysql and pgsql _must_ be different though. unless we are going to go with some sort of odbc interface to it all
19:26.22simonrvnthat would be better, long term
19:26.46TimRikertroubled: so what is actually different? is there not one syntax that works for all? then only one place to maintain it?
19:27.03TimRikerwhat _must_ be different?
19:27.06troubledno, pgsql and mysql syntax is completely diff
19:27.17simonrvnbecause then you abstract data access, and simplify code
19:27.18troubledmainly quotes syntax and field types though
19:28.03troubleddiff one of files in the  schemas and you will see what I mean
19:28.08TimRikerthere's a revoke in the pg, but that's not needed, it's just there.
19:28.23troubledwhich file?
19:28.36simonrvnthat's why things like mysql2pgsql exist ;)
19:28.36TimRikersimonrvn: no, in this case, you hide bugs by not testing all of the different dbs.
19:28.48TimRikerthey are out of sync now. ie: bug.
19:29.24TimRikersimonrvn: so in fact, all you have done is hidden interfaces, complicated the testing, and introduced bugs.
19:29.27simonrvnhm, good point
19:29.35troubledas for the index on nick,channel, I believe it may be an efficiency thing because of the way the sql select is done. i would have to look at it closer to be sure if its needed though
19:30.21TimRikerthe question is... should there be multiple entries for a nick in there or not. I say not, but having nick,channel means that there can be.
19:31.57troubledno, there shouldnt, but we have to keep it for legacy. tbh, the schemas we have are so terribly designed, it would be better to start from scratch, but then I'd piss the hell out of dondel
19:32.20troubledthere is absolutely _0_ normalization, but we cant just change it without breaking all your dbs
19:33.00troubledso yes, there should be a nick table, referenced by id instead of 40000 nick dupes. but unless you guys are going to change your dbs, you will lose everything
19:33.57troubledi figure I could take don's db to 40% of its current size just from normalizing a bit
19:34.00TimRikerno, I'm not a fan of a separate nick table. nick is passed in as text, it should be stored that way imho.
19:34.16*** join/#infobot cheese (n=Rambo111@adsl-10-128-163.mia.bellsouth.net)
19:34.23troubledyou should take a look at my db logger schema to see what i mean
19:34.29cheesehey, I was hoping to get infobot working, but I'm having issues
19:35.08TimRikertroubled: imho it's silly to log to a db, so you get what you asked for. :)
19:35.11cheeseit won't seem to work both with my mysql database, and also isn't working when I give it the root login/passwd
19:35.22troubledTimRiker: for proper normalization and db size savings, the nick, ident _and_ hostname must be separate
19:35.47cheeseohhh... wait
19:35.48TimRikertroubled: so that silly log tables are smaller? log to text. it's a _log_
19:35.49troubledbut the space savings would be minimal compared to my setup
19:36.04cheeseblootbot isn't infobot is it?
19:36.08troubledTimRiker: you seriously underestimate my db :)
19:36.19troubledits very efficient
19:36.19TimRikercheese: blootbot is now the new infobot, yes.
19:36.50TimRikertroubled: my 9 years of text logs are pretty efficient too.
19:37.04troublednot like this
19:37.18TimRikeryou underestimate bzip2
19:37.21troubledlog files lack full information for the queries I do
19:38.07TimRikerlike?
19:38.46troubledirc clients (and thus the db) store the full user info for each msg. logs only store nicks typically
19:38.58*** part/#infobot cheese (n=Rambo111@adsl-10-128-163.mia.bellsouth.net)
19:39.16troubledso parsing logs, its impossible at the forensic level to know for sure that a nicks host hasnt changed until they quit while you are logging
19:39.39troubledthen you run into problems like cut off logs which are hard to detect by script
19:41.10troubledand theres the fact that the db is fully index'd. so finding something in a sequential search could take minutes versus a 5-10 second search in the db for the same thing
19:42.39troubledmy db design is at least efficient enough that it compressed so well that rzip only saved 2mb of size over bzip2. and rzip is designed to highly compress large files (900mb uncompressed or so iirc) with poor normalization
19:42.53TimRikerthe logs record nick changes. so it's doable.
19:43.04troubledif the avg(length(nick)) is < sizeof(int4), I save space
19:43.21troubledthats the kind of logic and planning that went into my design and still holds true.
19:43.33TimRikeryou presume that I often want to search 9 years of logs, which I don't.
19:43.57troubledits doable with a degree of inaccuracy. ive been over it with someone with years of logs running a test import script already
19:44.09TimRikerwhat I _do_ want are daily logs that google can index and folks can download.
19:44.19TimRikerdb logs don't give me either of those.
19:44.26troubledtrue, but there is a problem with accuracy in standard irssi logs
19:44.38simonrvnunless you parse them out to flat file via script
19:44.46TimRikerhence the reason I don't use so-called-standard irssi logs.
19:45.21TimRikerit's the bot's job to log, not the client. that's what bots are for. :)
19:45.32troubledif you are just looking up nicks or "words" then flat file is fine if the goal is just for google to index them for you i guess
19:46.39troubledi just find it more convenient to log via client. then me and the bot dont have to both be in 50 channels, doubling bandwidth for when I want to talk in a channel
19:46.40TimRikerI've got 88M of sql data. I've got over a gig of logs just on this version of the bot.
19:46.59TimRikerI don't want anywhere near a gig of db to backup.
19:47.14troubledthat 88mb of sql could probably be stripped down to like 50 with some heavy normalization. maybe even more
19:47.55troubledbut the point of normalization isnt as much about space saving as it is faster lookups
19:48.20TimRikerI seriously doubt it. Most of that is factoid info or seen "last said" info. normalization won't help either of those at all.
19:49.07troubledthe reason I can pull out every record for any nick, ident or hostname out of a million records in ~100-200ms is because the table is small, and has a fixed size (thus calculated offsets)
19:49.14TimRikernormalization == slower lookups for simple queries. and that's about all the bot ever does.
19:50.10troubledits faster to normalize
19:50.18simonrvni don't log in irssi
19:50.42TimRikerso I guess the design would be different if I wanted to add a command to the bot that would let users search all 9 years of logs. that sounds like a Really Bad Idea right from the start.
19:51.28troubledlets take a look at your db and ill explain why its slow and why normalization would give you faster speed if done across all tables
19:51.45troubledtake a look at your row count for the seen table. how many you got?
19:51.49troubledand whats the avg size?
19:52.50troubledmysql> select avg(length(nick)) from nicks;
19:52.53troubled+-------------------+
19:52.55troubled| avg(length(nick)) |
19:52.58troubled+-------------------+
19:53.00troubled|            7.6225 |
19:53.03troubled+-------------------+
19:53.05troubled1 row in set (4.42 sec)
19:53.07troubledmysql> select count(*) from nicks;
19:53.10troubled+----------+
19:53.12troubled| count(*) |
19:53.15troubled+----------+
19:53.17troubled|   286546 |
19:53.19troubled+----------+
19:53.22troubled1 row in set (2.36 sec)
19:54.04troubledso, on freenode, the avg nick size is 7.6225 out of a distinct 286k nicks. Since my pkey is only 4 bytes, I am saving 3.6225 * 286546 bytes just on nicks
19:54.16troubledthe same thing applies to quit messages:
19:54.21TimRiker229590 entries in seen. some of which are probably bad due to the bad index setting.
19:54.29troubledmysql> select count(*) from quit_messages;
19:54.32troubled+----------+
19:54.34troubled| count(*) |
19:54.36troubled+----------+
19:54.39troubled|    32714 |
19:54.41troubled+----------+
19:54.44troubled1 row in set (0.42 sec)
19:55.37troubledso for months and months, instead of having thousands and thousands of quit messages logged, I only have 32k distinct ones, but the pkey id is only 4bytes, and the avg length of quit messages is:
19:55.41troubled+----------------------+
19:55.43troubled| avg(length(message)) |
19:55.46troubled+----------------------+
19:55.48troubled|              32.4172 |
19:55.50troubled+----------------------+
19:55.52troubled1 row in set (0.62 sec)
19:55.54TimRikertroubled: actually you're not saving that. you have 4 bytes in 2 tables plus the 7.6225 nicks.
19:55.59troublednow, _.62s_ is damn fast for 8 months of logs
19:56.13troubledi only have 1 nicks table
19:56.20TimRikerso you've added 286k*8 that you are not counting anywhere.
19:56.44TimRikerincrease that 8 by 4 more for each table where you use it.
19:56.51TimRikerand eventually you break even.
19:57.27troubledany time I break even, I make up for by smaller table size which equates to faster indexes and faster lookups
19:57.29TimRikerbut then add on top of that that every seen update is now a join on 2 tables, and you have much more sql overhead.
19:57.43troubledand its negated by the savings in tables like quit messages
19:57.48TimRikerno.
19:58.21TimRikerwhen you break even is exactly when you break even. you do NOT have smaller table sizes then. you have _even_ table sizes.
19:58.46TimRikercourse, I'm not counting index overhead for the new table, but it factors in at similar values.
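The break-even argument can be put in numbers using the figures quoted above (286546 distinct nicks, average nick length 7.6225 bytes, 4-byte ids), ignoring index overhead as noted:

```python
# Space accounting for the normalization debate, with the log's own numbers.
nicks, avg_len, id_size = 286546, 7.6225, 4

def denormalized(rows):
    # nick text repeated in every row
    return rows * avg_len

def normalized(rows):
    # a 4-byte id per row, plus the nicks table itself (id + text per nick)
    return rows * id_size + nicks * (id_size + avg_len)

# the 3.6225-byte-per-reference saving only pays off once the reference
# count covers the cost of carrying the nicks table:
break_even = nicks * (id_size + avg_len) / (avg_len - id_size)  # ~919k rows
```

So with a seen table of ~230k rows the normalized layout is still net larger, which is the point being made; the saving only materializes on tables with far more rows (or heavy duplication, as with quit messages).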
19:59.09troubledmy user id table, which is a join of nickid, identid and hostid, causes the break even in that area
19:59.49troubledbut i get accuracy of messages and 100 fold increase in lookup speed vs. sequential searches
20:00.23TimRikerso now your simple "update seen record" has morphed into a "check for and potentially add these 3 entries in each of 3 tables, and then add one record in this table" instead of "update this table"
20:00.41troubledthe normalization comes into play best when i do a dump more so than on disk since index size sucks hd space (which is dirt cheap)
20:00.46TimRikerI don't call that breaking even.
20:01.37TimRikerso to save the "dirt cheap" disk space, you've added much more overhead for the typical insert/update.
20:02.10TimRikeryou've successfully optimized the thing you rarely do at the expense of the thing you do all the time.
20:02.44TimRikerthis is a step backward, not forward.
20:02.44troubledsure, you add a bit more overhead for faster lookups
20:02.44troubledyou cant get both at the same time heh
20:02.44troubledspeed is the goal, not size
20:02.44troubledor cpu
20:03.00TimRikerexactly. so proper database design says you optimize for the usage, not optimize blindly.
20:03.23TimRikerand my point is that you have increased cpu load significantly.
20:03.28troubledsure its lower overhead to just slap everything in a table without any normalizing, but then your lookup times slow down exponentially
20:03.36TimRikerie: speed is worse, and size is better.
20:03.53troubledwe need speed though, not size
20:04.10TimRikeryou're missing the point. we don't do fancy lookups. we do brain dead lookups. and those are fast.
20:04.13troublednormalizing gives you that through smaller and fixed tables
20:04.29TimRikerexactly! we need speed, not size. and you have size and not speed.
20:04.40troubledyou are forgetting the terrible loss in performance you get in mysql with a non fixed table size due to the factoids being varchars
20:04.54troubledthats why you need the nuh's in a seperate table
20:05.01TimRikerheh, actually I'm not as I use sqlite which does not have that same issue.
20:05.16troubledwhether its better to split that up into 3 tables depends on the amount of data savings you would get
20:05.48TimRikernor does it have the network overhead or task switching that a mysql install requires.
20:06.06troubledbut if you had at least the nick!user@host in its own table, it could be fixed table size and it would be better to reference it
20:06.11TimRikercourse it _is_ perl, so that hardly registers. :)
20:06.28troubledbut the size savings only come into play with lots of duplicates, so it depends on your exact db's contents
20:06.59troubledwell, I agree sqlite is nice in that respect. mysql/pgsql do add overhead, but that also gives you www page potential
20:07.28TimRikersqlite can be used by more than one process, so there's nothing stopping a web interface.
20:07.45troubledsqlite3 with the daemon can
20:07.52troubledor do you mean file locks
20:07.57TimRikereven without a daemon.
20:08.23troubledi dont like deadlock issues that come with sqlite though
20:08.35TimRikersqlite uses record locking normally to allow multiple processes to access the same db. that's how the daemon version works in the first place.
20:08.44troubledat least mysql/pgsql will "self correct" with corruption and such
20:09.17troubledyes, but in the daemon version, its the gatekeeper doing all the sharing. so its really just 1 process accessing it
20:09.56TimRikeragreed. if I had a web interface up, I most likely would not design it using sqlite shared file access. but I don't, so sqlite is the lowest overhead, and the fastest performance. ie: exactly what I need. in addition it's the simplest to get running.
20:10.43TimRikertroubled: re daemon, not exactly and the daemon allows multiple forks .. or threads? I guess I don't recall.
20:10.46troubledwe could just go sqlite only, i wouldnt mind. we can always put a web interface into the bot later to deal with people that want that sorta thing
20:11.37troubledre daemon: ya, but it does have deadlocking and timeout code I would assume. still not the same as if infobot choked and blocked, then the daemon wouldnt know how to deal with it
20:11.52TimRikerI think it's a good thing that we support multiple sql dbs. some folks will want to use phpmyadmin etc to get other views into the data. that's all good. I just don't happen to want that enough to deal with the little overhead it adds.
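The point about sqlite's file locking is easy to demonstrate: two independent processes (say, the bot and a web frontend) can open the same database file, serialized by the lock, without any daemon. A minimal stand-in using Python's sqlite3 in one process with two separate connections (the Perl DBD driver behaves the same way at the file level):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "bot.db")

# "the bot": first connection creates and writes
bot = sqlite3.connect(path, timeout=5)   # timeout = how long to wait on the lock
bot.execute("CREATE TABLE factoids (k TEXT UNIQUE, v TEXT)")
bot.execute("INSERT INTO factoids VALUES ('foo', 'bar')")
bot.commit()                             # commit releases the write lock

# "the web interface": a second, independent connection to the same file
web = sqlite3.connect(path, timeout=5)
value = web.execute("SELECT v FROM factoids WHERE k='foo'").fetchone()[0]
```

The caveat raised in the log stands: if one side holds a write lock and stalls, the other just times out, so a long-lived web app would still be better served by a server engine.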
20:12.42troubledwhich comes back to the schema differences between the engines. most are index related I believe?
20:13.46troubledi dont mind if say mysql and pgsql have diff indexing and specific speed tweaks over sqlite resulting in different files so long as the schema are the same for all and dont alter the bots access to the data
20:14.12TimRikerdefault values or lack thereof is what I was noticing.
20:14.48troubleddefaults werent really needed since the bot doesnt miss anything on inserts
20:14.49TimRikerin mysql seen:  `nick` varchar(20) NOT NULL default '',
20:14.53TimRikerthat's just wrong.
20:15.12TimRikerthe point in having NOT NULL is that you can't add one without a nick. adding a default kills that.
20:15.23troubledshouldnt even be in there really. but when it was converted as was
20:15.34TimRikerexactly. read: bugs
20:15.40troublednod
20:16.07TimRikerif someone adds code that does not set nick, then mysql will happily allow that bug to exist.
20:16.25TimRikerand as we have 4 sets of sql setup scripts, someone won't notice.
20:16.27troubledi dont like "default" myself either
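The objection to `NOT NULL default ''` is demonstrable in a few lines: with the default, a code path that forgets to set nick is silently accepted; without it, the constraint catches the bug. sqlite3 is used here as a stand-in for the mysql schema under discussion:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE seen_bad  (nick TEXT NOT NULL DEFAULT '', t INTEGER)")
con.execute("CREATE TABLE seen_good (nick TEXT NOT NULL, t INTEGER)")

# buggy code path that forgets to supply a nick:
con.execute("INSERT INTO seen_bad (t) VALUES (1)")      # silently accepted as ''
try:
    con.execute("INSERT INTO seen_good (t) VALUES (1)")
    caught = False
except sqlite3.IntegrityError:                          # NOT NULL does its job
    caught = True
```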
20:17.04TimRikerI'd strongly prefer that we find one set of sql setup scripts that works on all dbs.
20:17.04troubledwhen i have to edit, i generally go once over all the diff engines. I think I missed a change once that simon caught, but its usually not a problem and hard to miss
20:17.22simonrvnTimRiker: and just specify which on the CL
20:17.29troubledI think the end goal was to put all the bootstrap stuff into that setup script
20:17.56TimRikeryou seem to be saying that we should never end up where we are. while that's a nice thought, we are here with the "default" lines that should not be there.
20:18.15troubledi dont like the idea that the user needs to know anything when they setup the bot, but I just havent got around to it
20:18.28troubledthen fix em :)
20:18.31TimRikermy point is that having one set of scripts would prohibit us from ending up here again. the dbs would have to be in sync, not just chance to be once in a while.
20:18.45simonrvnwell, they should know *something*
20:19.15troubledlike i was trying to say, when i made them, they were converted temp "as is" with only cosmetic changes, so as to not conflict with don's and other existing db's until we decided on specific changes
20:19.24troubledchanges like this "default" issue
20:19.30TimRikerthere used to be only one set till simonrvn split them up, if I'm reading the logs correctly.
20:19.40troubledi would have totaly change the schemas long time ago otherwise ;)
20:20.14troubledi think that was my patch that simonrvn applied for me since i didnt have svn commit yet
20:20.20simonrvnyeh
20:20.29TimRikercould be.
20:20.39simonrvni still have that patch
20:20.40troubledshould be like "troubled is da man" or something heh
20:20.55TimRikerwell in order to test merging them, i guess I need all 4 back ends setup. ugh.
20:21.05troubledthen he fubar'd and forgot to create some files etc. needless to say a few revs later it was finalised heh
20:21.12TimRikerhehe
20:21.19simonrvn*grin*
20:22.02troubledTimRiker: i think the sqlite vs. sqlite2 was because of the way the old setup script locates the path (based on the conf var for db type)
20:22.30troubledwe could merge them, but then the setup script needs an edit so it treats sqlite and sqlite2 as the same path to schema
20:22.44troubledcause i dont think they actually differ
20:24.03TimRikertroubled: and do the others _need_ to differ?
20:24.18TimRikerI realise they do, but do they _need_ to.
20:24.30troubledya, pgsql and mysql arent compatible
20:24.37TimRikerlike the sqlite init does not use quotes, but it could, it just doesn't now.
20:25.09troubled' vs ` was a problem, as were the field types
20:25.23troubledvarchar vs. character varying, kinda stuff
20:25.30TimRikertroubled: you're telling me that with our simple brain dead schema that we can't come up with a version both mysql and pgsql like?
20:25.34simonrvnyes, i hate that ' ` crap :/
20:26.03troubledTimRiker: not that I am aware of. unless they both have some sql compliant versions of each hidden somewhere
20:27.13troubledtbh, it wasnt an issue with only mysql and sqlite. its when i had to add pgsql that it got screwy. and the method I used to take $DBTYPE for path to schema meant i needed to split everything. but a (sqlite|sqlite2) check could possibly let us merge the 2 sqlites
20:28.40TimRikertroubled: I'd hope to at least merge all but pysql
20:28.49troubledpgsql!
20:29.00TimRikers/pysql/pgsql/
20:29.09troubledbetter... ;p
20:30.50troubledreally though, seperate files shouldnt matter so long as _we_ know enough to keep them all compatible. far as a user is concerned, he should only have to run the setup file and it should just do the right thing (tm)
20:32.23troubledkinda coming back to the regression tests it would seem
20:34.15TimRikerwell, having fewer sets generally means less regression testing.
20:34.38TimRikerstill need some to catch things like mysql case insensitive indexes.
20:35.09troubledjust make the table a ci type
20:35.13troublednot cs
20:35.22troubledbut then sqlite would choke on it
20:35.38troubledcollation type iirc
20:35.47TimRikernever tried case insensitive in sqlite. might work.
20:36.14troubledsqlite supports collations?
20:36.19troubledsetting i mean
20:36.38TimRikerbut my point was the reverse. mysql is case insensive by default as the tables don't specify that. and sqlite is case sensitive by default.
20:38.17TimRikerhmm. do I have that backwards? that's the way I recall the bugs. It's been a while.
20:38.45troubled<PROTECTED>
20:39.36troubledmysql goes by the collation of the field. so ci for insensitive, and cs for sensitive. the index will do the right thing i believe
20:40.01troubledso right there, sqlite and mysql syntax would be incompatible
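A minimal sketch of the incompatibility being discussed: sqlite spells case-insensitive matching as `COLLATE NOCASE` on the column, while mysql uses `*_ci` collations, so one CREATE TABLE cannot serve both engines verbatim. Names here are illustrative.

```python
import sqlite3

# sqlite's per-column case-insensitive collation
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE factoids (factoid_key TEXT COLLATE NOCASE)")
conn.execute("INSERT INTO factoids VALUES ('BitKeeper')")

# NOCASE makes comparisons (and indexes on the column) case-insensitive,
# analogous to a mysql _ci collation such as utf8_unicode_ci
hit = conn.execute(
    "SELECT count(*) FROM factoids WHERE factoid_key = 'bitkeeper'").fetchone()
print(hit[0])  # 1
```

The mysql equivalent would be something like `factoid_key VARCHAR(64) COLLATE utf8_unicode_ci`, which sqlite rejects, hence the separate-schema problem.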
20:40.14*** part/#infobot Ttech (n=ttech@fullcirclemagazine/developer/ttech)
20:40.19simonrvnTimRiker: so did i
20:40.28TimRikerutf8, man use utf8.
20:40.57troubledtbh, im not sure if it was possible  with utf8. i would have to check
20:41.03troubleddont see why not though
20:41.04simonrvni meant "ughs" ;)
20:41.07TimRikermy.cnf:default-character-set = utf8
20:41.08TimRikermy.cnf:default-collation = utf8_unicode_ci
20:41.25simonrvni set mine that way also
20:41.40TimRikerthe 'one true way'
20:41.48troubledmight be a good idea to force it during table creation though for people that have it set conflicting
20:42.14simonrvna) it was putting it as latin1_swedish_ci (WTF?) b) UTF8 r0x my s0x
20:42.22TimRikerwell, we currently only index ascii (barring any odd tables troubled is playing with)
20:42.23troubledi think its still coming down to us needing to have seperate schemas though
20:42.52TimRikersimonrvn: mysql hq is in sweden. hence the default.
20:42.52troubledi dont know enough about latin1 vs. utf8/16 tbh
20:43.01TimRikerclearly. :)
20:43.08simonrvnoh, that explains it
20:43.29simonrvnascii first 127 are the same as utf8's
20:43.41troubled<PROTECTED>
20:43.44simonrvnthat was done on purpose in utf8
20:43.45TimRikerlatin1 can only express latin1. no jp, zh, kr, ru, gr, etc.
20:43.59troubledahh
20:44.08TimRikerutf8 can handle them all. one codeset to rule them all and in the darkness bind them.
20:44.16troubledgenerally i use utf8 though. just some leftovers around in my setup
20:44.22troubledlol
20:44.25simonrvnand extendable too
20:44.51TimRiker0-127 are the same for almost all of the codesets including utf8
20:45.09TimRikerit's 128-xxxxxxxxxx that differ
20:45.56TimRikerlatin1 is always one byte. but limited in what it can display. swedish uses it, but sorts if differently than english.
20:45.58simonrvnyup
20:46.19TimRikerutf8 starts with one byte but can be up to 6 bytes per character.
20:47.00simonrvnnon-spacing marks front and back, and a 4 byte char
20:47.20simonrvnor a combo
20:48.45TimRikernormal set only include up to 4 byte, and the 4 byte is hardly used, so really utf-8 is often just 1-3 byte.
20:49.09TimRikerbbiab
20:49.26troubleddo we have a sqlite->csv/mysql script?
20:55.45TimRikerhmm. I have one around someplace... isn't there a dump script in scripts/ that still works?
20:56.02troubledi see a few, havent tried em though
20:56.26TimRikersqlite <table> .dump    will get you an sql backup. chances are just the creates would need to change to get it into mysql.
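The `.dump` route can be sketched from Python's stdlib, whose `iterdump()` produces the same kind of SQL text as the `sqlite3 <file> .dump` command mentioned above. Schema here is illustrative.

```python
import sqlite3

# Build a tiny source db, then dump it to portable-ish SQL text.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE seen (nick VARCHAR(20) NOT NULL, message TEXT)")
src.execute("INSERT INTO seen VALUES ('troubled', 'hi')")

dump_sql = "\n".join(src.iterdump())
# dump_sql now holds BEGIN/CREATE/INSERT/COMMIT statements. The INSERTs
# usually port to mysql as-is; the CREATEs are the part that needs the
# engine-specific bits rewritten by hand.
print("CREATE TABLE seen" in dump_sql, "INSERT INTO" in dump_sql)
```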
20:56.27troubledwas gonna see if I could grab a copy of your db so I can put into mysql and propose some changes
20:58.30TimRiker12M here: http://rikers.org/tmp/20080315.bz2
20:58.38troubledill hack up a quick sqlite cli based bash script to stress test some stuff and do a before and after comparison. see how changes go
20:58.57troubledgive me a few to see how well the convert scripts work :)
20:59.24troubledwell, after the download finishes heh
20:59.29TimRikerthere's not much difference between dumps.
20:59.46troubledwhat do you mean?
20:59.51TimRikerjust set your bot to sqlite and run, to get a blank db.
20:59.56troubledbetween the sqlite dump and the perl one in infobot?
21:00.04TimRikerwell, I use sqlite2, but either should work.
21:00.39troubleddo you have access to mysql?
21:00.42TimRikerno, I mean my dumps over time.
21:01.05troubledya, i figure the changes are gradually slowing down
21:01.13TimRikeryeah, mysql5 was on ibot's server. don't recall if I shut it down or not, but it's trivial to get 4 or 5 running.
21:02.02troubledk. ill see if it converts ok to mysql/csv, put up the original and the altered one, and send back the altered so we can bench some changes
21:02.31troubledid sooner do sqlite tests though. much easier to bench, but more manual to edit the schema compared to a gui
21:03.17TimRikerI'm generally solely interested in run time overhead for the bot. size is irrelevant. and I don't plan on adding any log queries, so logging to db is not at all interesting to me.
21:03.41troubledim not after size either, i want lookup speed only
21:03.49TimRikerlookup for what?
21:03.59troubledanything in the db
21:04.28TimRikerk. well, "anything" is not what the bot normally does. I'm interested in what the bot normally does.
21:04.34troubledreads are more important than writes, so more specifically, im interested in any design changes that can give speed increases for reads
21:04.55TimRikernote: this does NOT include the speed for listkeys for example. I don't consider that a "normal" bot task.
21:05.06TimRikerbotmail, seen, factoid looksup, all normal.
21:05.54TimRikeragain, what kind of reads? all the normal queries are simple unjoined lookups.
21:05.58troubledthe dump's schema is the same as the one in the svn?
21:06.28TimRikerno. it's blootbot. so there are a few trivial differences. it will run with the svn bot though.
21:06.48troubledok, let me try import into sqlite as a start
21:08.03TimRikerk
21:08.09troubledbtw, is sqlite3 backward compatible with sqlite2? or do I need both?
21:08.18simonrvngo for 3
21:08.38TimRikerdb file format has changed.
21:08.48TimRikerdump/import should be the same.
21:09.10TimRikerso you need to use the command line tool that matches your installed perl modules.
21:09.34TimRikerI normally run the bot to get an empty table, then run sqlite against it. it'll complain if they differ.
21:11.59troubledya, i just tried the sqlite2 schemas and it didnt like the import
21:12.07troubledSQL error: factoids.created_time may not be NULL
21:12.10troubledSQL error: factoids.requested_by may not be NULL
21:12.46troubledabout 500 flew by, so there could be more problems too
21:13.20troubledso your db needs at least a few changes to be compliant with our HEAD copy of sqlite2 schema
21:14.16troubledwell, let me fudge the schema a bit so I can at least get it imported, then go from there shall we :)
21:15.57troublednews and uptime.sql for sqlite2 dont import proper either, even though we dont use em yet
21:16.44troubledsqlite doesnt seem to like #'s in the schema for comments *shrug*
21:17.02troubledalthough i think the bot was changed to parse them out as a kludge
21:17.06TimRikeryep. need to reload the table myself to cleanup issues like that.
21:17.57TimRiker#'s in sql are not standard. they are a mysql extension.
21:18.06TimRikerproper sql is --- if you can believe that.
21:18.23simonrvnisn't it -- ?
21:19.13troubledi didnt realize your dump had the schema in it though, but its good we caught that i guess
21:19.28troubledya -- iirc
21:20.29troubledalthough technically -- + anything would be a comment
21:20.42troubledbah, let me do this on my wks :)
21:20.52simonrvnyeh --
21:20.57TimRikeroh, yeah. --
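The kludge mentioned a few lines up (the bot parsing out `#` comments) could look roughly like this. This is a simplified guess at the approach, not the bot's actual code: it drops whole-line `#` comments only and would mangle a `#` inside a string literal.

```python
# mysql accepts '#' comments as an extension, but standard SQL (and
# sqlite) only takes '--', so a schema loader has to strip the '#'
# lines before handing the file to sqlite.
def strip_hash_comments(schema: str) -> str:
    return "\n".join(
        line for line in schema.splitlines()
        if not line.lstrip().startswith("#")
    )

schema = """# seen table
CREATE TABLE seen (
  nick VARCHAR(20) NOT NULL  -- standard SQL comment, fine everywhere
);"""
print(strip_hash_comments(schema))
```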
21:24.31troubledTimRiker: might want to take care of all your null fields though before you convert
21:32.52TimRikertroubled: yep, planning on it.
21:34.28TimRikerthough I may change the schema as well. force the code to set what should be set, rather than forcing it in the schema. have not decided.
21:35.16TimRikerrequested_by VARCHAR(64) NOT NULL DEFAULT 'nobody',   makes sense, but better to insert nobody when creating the factoid imho. thoughts?
21:35.35troubledi think thats already done for some reason
21:35.42troubledmaybe not everywhere though
21:35.52TimRikerexactly.
21:36.08TimRikerremoving the default exposes the "not everywhere" cases.
21:36.32troubledlets agree to remove all "default" from all schemas as a start then?
21:36.34TimRikeror if we are going to keep the default, then we should remove the ones that do specify nobody. etc.
21:37.11TimRikerI tend towards the no-defaults in the schema as it makes it easier to force the code to be consistant.
21:37.23troubledworks for me
21:37.38TimRikerhowever it also means more places to touch if we decide that "nobody" should be a valid nic now and want something else there.
21:37.43troubledbut it may break trunk for a bit til we test each engine
21:38.25TimRikerneed to walk softly where db changes are concerned.
21:39.02troubledi say we just make all the changes we want done and be done with it tbh. all the db's are already out of whack (yours, and dons namely)
21:39.31*** join/#infobot simonrvn (i=simon@unaffiliated/simonrvn)
21:39.31*** mode/#infobot [+v simonrvn] by ChanServ
21:39.34troublednot sure who else has db's though that could be bitten
21:43.42troubledsqlite> select avg(length(locked_by)) from factoids;
21:43.44troubled35.5281690140845
21:43.47troubledsqlite> select avg(length(created_by)) from factoids;
21:43.49troubled40.0837810684489
21:45.10troubledthats tons bigger than say a 4 byte pkey id. not to say that i would go as far as splitting it up into 3 separate fields, but at least moving the line as a whole might have a bit of an effect. let me see if I can get a group by for count or something in sqlite
21:48.54troubledsqlite> select created_by,count(created_by) from factoids group by created_by order by count(created_by) desc limit 5;
21:48.57troubled|242
21:48.59troubledhoxu!hoxu@dsl-jklgw3i1c.dial.inet.fi|226
21:49.02troubledgreycat!~wooledg@imadev.eeg.ccf.org|197
21:49.04troubledlocalhost!root@a1-3c045.neo.rr.com|145
21:49.07troubledgreycat!wooledg@imadev.eeg.ccf.org|126
21:49.09troubled242 nulls :/
21:50.27troubledso out of the ~114k, there are at least a few thousand that are dupes. multiplied by avg length times the 2 or 3 fields that have potential host info, that could drop table size enough to give faster lookups. and some indexes on them would really kick it into high gear
21:51.03troubledlots of 50 ish dupes
21:51.15troubledgreycat!wooledg@imadev.eeg.ccf.org|126
21:51.17troubledgreycat!~wooledg@192.35.79.70|99
21:51.56troubledand splitting the field further into seperate nick, ident and hosts would save a tad for nick dupe where the ip changes (which is most common change)
21:52.12troubledbut a slight cost of over head for joins on 3 tables instead of 1
21:52.38troubledim leaning towards a simple 1 table though. less code change and probably the best increase in efficiency we can hope for
21:54.32troubledmaybe just a simple "users" or "nuh" table for storage of nick!user@host for the entire set of tables to cut down on things a bit. ill see if I can manage to manually test that change on your db set and see what kinda savings and speed difference we are looking at. hypothetical only gets you so far. but 4byte times a few spots is better than 64bytes times the same number of spots
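The proposed "users"/"nuh" table can be sketched like this. Schema and names are illustrative, not a committed design: the repeated nick!user@host strings move into one keyed table, and the factoid rows carry a small integer id instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
  id   INTEGER PRIMARY KEY,   -- rowid alias; the small fkey target
  nuh  TEXT UNIQUE NOT NULL   -- nick!user@host, stored exactly once
);
CREATE TABLE factoids (
  factoid_key   TEXT,
  created_by_id INTEGER REFERENCES users(id)
);
""")

def user_id(nuh):
    # insert-or-fetch: ~10k distinct users instead of ~114k copies
    conn.execute("INSERT OR IGNORE INTO users (nuh) VALUES (?)", (nuh,))
    return conn.execute("SELECT id FROM users WHERE nuh = ?", (nuh,)).fetchone()[0]

for key in ("bash", "sed", "awk"):
    conn.execute("INSERT INTO factoids VALUES (?, ?)",
                 (key, user_id("greycat!wooledg@imadev.eeg.ccf.org")))

# the duplicated hostmask is now stored once, joined back on demand
n = conn.execute("SELECT count(*) FROM users").fetchone()[0]
print(n)  # 1
```

The trade-off raised later in the log still applies: every read that wants the hostmask back now pays for a join.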
21:54.52troubledwhich in turn cuts down on the...aparently non existant rofl....indexes
21:58.49troubledsqlite doesnt support views does it?
21:59.32troubledwould be the simplest was to avoid changing any code for db changes while increasing efficiency
22:03.32TimRikerfixing a bunch of nulls. I put up another. you can nuke the old one.
22:03.59troubledupdate factoids set created_by = "default" where created_by = null;?
22:04.05troubledthat kinda thing?
22:04.22troubledquicker to just match the sql probably :)
22:04.52troubledsame url though? probably wont take long to download again
22:05.37troubledill try import it with the svn schema for sqlite2
22:06.27TimRikeryeah, over different tables.
22:06.31troubled18:06:17 ERROR 404: Not Found.
22:06.38TimRikernot there yet.
22:06.42troubledah
22:08.16TimRikerk. there now
22:08.27troubled<PROTECTED>
22:08.31troubledstill 404
22:08.58TimRikerhmm. didn't get em all..
22:09.31TimRikerhttp://rikers.org/tmp/20080315.bz2 ?
22:09.32troubledthis machine is using the proxy though, so could be that causing a problem. first download was from the gateway
22:10.40troubledthere we go, --no-proxy fixed it *shrug*
22:11.00troubledprobably cache the 404
22:11.04troubledcached*
22:16.03troubledsqlite> select count(*) from factoids where created_by is null;
22:16.05troubled91621
22:16.40troubledthat cant be right. created_by = null returned 0 :/
22:17.10troubledsqlite> select count(*) from factoids;
22:17.27troubled114514
22:17.27troubledsqlite> select count(distinct(created_by)) from factoids;
22:17.27troubled9762
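The surprise above (`= null` matching 0 rows while `IS NULL` matches 91621) is standard SQL three-valued logic, not a sqlite quirk: `col = NULL` evaluates to NULL, never TRUE, so it selects nothing; `IS NULL` is the actual null test. A minimal reproduction, with illustrative data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE factoids (created_by TEXT)")
conn.executemany("INSERT INTO factoids VALUES (?)",
                 [(None,), (None,), ("hoxu!hoxu@dsl.example",)])

# '= NULL' is never true under three-valued logic
eq_null = conn.execute(
    "SELECT count(*) FROM factoids WHERE created_by = NULL").fetchone()[0]
# 'IS NULL' is the correct null test
is_null = conn.execute(
    "SELECT count(*) FROM factoids WHERE created_by IS NULL").fetchone()[0]
print(eq_null, is_null)  # 0 2
```

The same rule bites UPDATE statements too: `... WHERE created_by = null` quietly updates nothing.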
22:19.05TimRikerk. well still working on it.
22:19.08troubledwell, we could save 100k rows worth in the table for created_by alone by using a fkey'd users table
22:19.45troublednot so much space, but since we dont have index atm, thats a sequential lookup. and 10k is faster than 100k. although not by much on a fast machine
22:20.45TimRikercreated_by can be null in the sqlite svn sql
22:21.01troubledso proposed would be an index'd 10k user table vs. a sequential 114k table currently. should give a few ms of speed
22:21.24troubledsure it can be, but the code ensures that its always set im pretty sure
22:21.42TimRikeryou're forgetting the extra join, which will likely kill any speed saves.
22:21.45troubledthe null allowed was only cause of older db's like yours with them before that code was introduced
22:22.05TimRikeryeah, I'll likely set em all to nobody
22:22.39troubledwell, nulls are fine tbh. smaller data
22:23.01troubledand if you are going to set it to a word, length("null") < length("nobody") :p
22:23.28troubledbesides, I think the bot code already knows how to handle nulls doesnt it?
22:24.32troubledinfobot: literal working fine?
22:24.37troubledinfobot: factinfo working fine?
22:24.37infobotthere's no such factoid as working fine?, troubled
22:24.37TimRikerI forget how we handle factoid locks. unlocked should have 0 time? null locker? both? either?
22:24.56TimRikerinfobot: literal slashdot
22:24.56infobot"slashdot" is "http://slashdot.org, or they're so lame they had to interview unknown_lamer for more content"
22:25.26troubledinfobot: factinfo bitkeeper
22:25.26infobotbitkeeper -- it has been requested 92 times, last by prpplague at Tue Apr  4 12:56:47 2006.
22:25.35troubledinfobot: factinfo debian
22:25.35infobotdebian -- created by atoponce <n=aaron@ubuntu/member/pdpc.supporter.bronze.atoponce> at Fri Sep  7 01:57:55 2007 (190 days); last modified by nobody; it has been requested 13 times, last by gutz|work, 3h 31m 54s ago.
22:25.42troubledya, nulls are handled
22:26.03troubledprobably best to keep em that way. saves on space too. and smaller the table the better
22:27.37troubledya, seems non 0 locktime means locked
22:30.04TimRikerwell, we ought to be able to handle request_by is NULL too then, no?
22:30.06troubledsweet, sqlite supports views :)
22:30.23troubledwe can technically change the db at least without code tweaks
22:30.48troubledhmm, wait, im betting it cant do inserts. usually they can provided only 1 foreign table though
22:31.09TimRikerwhat happens if you insert into a view where nothing exists? does it make new records in all the linked tables?
22:31.09troubledi think NULL's are handled for everything
22:31.22troubledbest way to check is to look up a few null records and check them here
22:32.18troubledwell, the problem with a view is that unless you have triggers, views can only basically insert to the same table. which limits the view to itself, save for "where not visible" type selects
22:33.05troubledso we would have to update the bot code to deal with them. not a big problem though, just take some time to parse out where all the sql is and fix it to do the inserts different
22:33.44troubledthe way I layed it out in my db is the data goes to the normal table, but selects all go from a "view_$NAME" table
22:34.38troublednow, for speed, we could cache id's of stuff we already looked up, but the sqlite lib should handle that for us and it wouldnt break simultaneous updates.
22:35.02troubledif we cache id's and the db id changes by something else, the bot would trash the db
22:35.37troubledand since we dont have any kind of referencial integrity in the db for sqlite (that im aware of), its all on us to ensure in the bot
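The trigger idea above can be sketched as follows. This is a hypothetical layout, not the bot's schema: sqlite views are read-only by themselves, but an INSTEAD OF trigger can route an INSERT on the view into the real tables, so existing SQL against `view_$NAME` could keep working unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, nuh TEXT UNIQUE);
CREATE TABLE factoids (factoid_key TEXT, created_by_id INTEGER);

CREATE VIEW view_factoids AS
  SELECT f.factoid_key, u.nuh AS created_by
  FROM factoids f LEFT JOIN users u ON u.id = f.created_by_id;

-- views can't take inserts directly; this trigger fans the insert
-- out to the two underlying tables instead
CREATE TRIGGER factoids_insert INSTEAD OF INSERT ON view_factoids
BEGIN
  INSERT OR IGNORE INTO users (nuh) VALUES (NEW.created_by);
  INSERT INTO factoids
    SELECT NEW.factoid_key, id FROM users WHERE nuh = NEW.created_by;
END;
""")

conn.execute("INSERT INTO view_factoids VALUES ('sed', 'greycat!wooledg@host')")
row = conn.execute("SELECT factoid_key, created_by FROM view_factoids").fetchone()
print(row)  # ('sed', 'greycat!wooledg@host')
```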
22:36.53TimRiker~forget ! did i just "hear" you say the the xfbdev
22:36.53infobotTimRiker: i forgot ! did i just "hear" you say the the xfbdev
22:37.30TimRiker~bitkeeper
22:37.31infobotit has been said that bitkeeper is at http://www.bitkeeper.com/ or at http://www.bitmover.com/ or hosted trees at http://www.bkbits.net/ or like cvs and subversion, but its non-opensource.. but its really good, even linus uses it to develop the kernel!
22:37.45TimRiker~"bitkeeper
22:37.45infobotit has been said that "bitkeeper is written by kernel developers, for kernel developers" hmm.
22:38.12TimRiker~factinfo "cpu"
22:38.12infobotTimRiker: no extra info on "cpu"
22:38.28TimRiker~forget "cpu"
22:38.28infobotTimRiker: i forgot "cpu"
22:38.32troubledalso, one technique I use for my logger repo is to provide sql incremental updates and a full schema. would require some documentation, but at least we can get a db schema we like, then carry it over time. no more excuses for out of date dbs. might even be able to setup an automated update system at some point
22:40.31CIA-26infobot: 03timriker * r1639 10/trunk/setup/sqlite2/ (factoids.sql seen.sql): sqlite2 changes
22:41.09TimRikeryep. need to do that at some point.
22:41.47troubledalso, 64 is too small for nick!user@host. should be like "16!8@63" = 89, or for up to 32 char nick = 105
22:42.29troubled16!8@63 is freenode though, and 89 (or even 90 or 100) should be more than enough
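Checking the arithmetic in the "16!8@63" figure: the nick, ident, and host limits plus the `!` and `@` separators.

```python
# freenode-style limits quoted above: 16-char nick, 8-char ident,
# 63-char host, joined as nick!ident@host
NICK, IDENT, HOST = 16, 8, 63
full = NICK + 1 + IDENT + 1 + HOST   # 16 + '!' + 8 + '@' + 63
print(full)                          # 89
print(32 + 1 + IDENT + 1 + HOST)     # 105 with a 32-char nick
```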
22:43.17troubledalso, we get more speed at the cost of extra storage by using fixed fields instead of varying
22:43.51troubledwould negate the space saved a tad from the normalizing, but the speed from it all should be worth it. what do you think?
22:44.28troubledwhat I mean is a foreign table with fixed sized fields would be well worth the extra storage
22:44.40troubledtoss and index on top and we should be smokin
22:46.26CIA-26infobot: 03timriker * r1640 10/trunk/setup/sqlite2/ (connections.sql news.sql uptime.sql): sql comments
22:48.21TimRikerI'm still a complete non-fan of normalization. I'm not convinced that any of them are appropriate.
22:49.15troubledwhat is the sqlite equiv of a auto increment/serial type?
22:50.03troubledthats ok, ill just be testing and giving you benchmark results so we can see for sure just how much of a diff it makes and if its worth it
22:50.15troubledconsider all this R&D :)
22:50.41troubledthats why im just changing the db, not the code yet. and if I do, ill put the proposed in a branch anyways
22:57.27TimRikerwell, gotta head home. back on later. I hope to get the bot moved over today.
22:57.35TimRikercya

Generated by irclog2html.pl Modified by Tim Riker to work with infobot.