irclog2html for #openjtag on 20060716

04:27.00*** join/#openjtag ka6sox (n=ka6sox@nslu2-linux/ka6sox)
06:59.45*** join/#openjtag _AchiestDragon (n=dave@whipy.demon.co.uk)
07:20.51*** join/#openjtag AchiestDragon (n=dave@whipy.demon.co.uk)
09:49.25Griffon26vmaster: you around?
09:56.29Griffon26I'm trying to do some JTAG speed calculations to see if I understand the influence of latency correctly.
09:56.55Griffon26If anyone has some numbers I could use to verify the calculations, please let me know.
10:35.05*** join/#openjtag bullet (n=bullet@226-68.1-85.cust.bluewin.ch)
10:35.42*** join/#openjtag vmaster_ (i=vmaster@p549B6733.dip.t-dialin.net)
10:43.38vmaster_Griffon26: I'm here
10:44.03Griffon26I'm trying to match my calculations with what you told me about your transfer speeds
10:45.11Griffon26you said you could transfer 54 bytes in one go
10:45.44Griffon26for those you'd have to wait for an ack, so that's 2 * latency
10:46.17Griffon26this was USB1.x, so max 12Mbps and 1ms latency
10:46.41Griffon26and you got a speed of several 100s K/s?
10:48.03vmaster_there are two ways to transfer data on an ARM7/9
10:49.37vmaster_one allows me to transfer 54 bytes at a time, the other transfers only 4 bytes at a time, but i continue without waiting for an ack
10:51.29vmaster_i'll just run some tests to give you definite numbers
10:51.38Griffon26ok, great! thanks
11:02.16vmaster_okay: 100% safe transfers: ~25kb/s
11:02.52vmaster_same method, but without waiting for the target: ~50kb/s
11:03.06vmaster_different method, with a small handler running on the core to accept the data: ~100kb/s
11:03.41vmaster_with fewer processes running i get about 20% better results
11:06.32Griffon26and your jtag adapter is manually toggling the clock pin, right?
11:06.46Griffon26I mean, the clock is also coming from USB data?
11:11.52Griffon26assuming a JTAG overhead of a factor of 30, I come to 26K/s with 1ms latency and 51K/s with 0ms latency (both USB1.x). But I may be oversimplifying.
11:12.19Griffon26I'm not taking into account any bottlenecks in the other parts of the probe and a JTAG overhead of 30 seems excessive.
11:19.17vmaster_i'm using the MPSSE
11:19.29vmaster_so the clock is toggled for every bit shifted to TMS or TDI/TDO
11:20.13Griffon26ah, that's good
11:20.32vmasteryou'll also have to build large queues of commands
11:21.27vmasterbecause every scan you send takes at least one USB frame
11:22.07Griffon26that means 54 bytes transferred per frame, right?
11:23.43Griffon26oh wait.. I found an error in my calcs
11:23.45vmaster54 bytes of payload
11:24.01vmasterbut that's a lot more scans
11:24.06Griffon26yes, I know
11:25.13Griffon26I counted only 1 latency time
11:25.21Griffon26should be 2 I think, per block
11:27.21vmastermhh, you can't calculate the latency that easily, i guess
11:27.34Griffon26it's max you mean
11:27.54vmasteryeah, it's just an upper bound
11:28.42Griffon26well, it's safe to say that the latency of the reply is just added in full.. for the first one I should do some rounding up
11:32.26Griffon26hmm.. you can always send 54 bytes within one frame, so latency is the only thing that influences speed
11:32.45Griffon26that can't be right
11:37.54Griffon26if you have 1ms frames, if 54 bytes fit in one frame (incl overhead) @ 12Mbps, if you wait for ack -> 54 bytes take 2 ms -> 26K/s. Huh?
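The arithmetic in the line above can be sketched as follows. This is a rough model only, assuming (as in the discussion) that each 54-byte block occupies one 1 ms full-speed USB frame and that waiting for the ack costs one more frame. The function name and the K = 1024 convention are illustrative choices, not something stated in the log.

```python
# Rough latency-bound throughput model for blocks sent over USB.
# Assumption: one block per frame, plus extra frames spent waiting for an ack.

def throughput_kib_per_s(payload_bytes, frame_s, frames_per_block):
    """Bytes per second, expressed in KiB/s (K = 1024, an assumption)."""
    seconds_per_block = frame_s * frames_per_block
    return payload_bytes / seconds_per_block / 1024


FRAME_S = 1e-3   # USB full-speed frame length: 1 ms
PAYLOAD = 54     # payload bytes per block, as quoted in the log

with_ack = throughput_kib_per_s(PAYLOAD, FRAME_S, frames_per_block=2)
without_ack = throughput_kib_per_s(PAYLOAD, FRAME_S, frames_per_block=1)

print(f"waiting for ack:     {with_ack:.1f} KiB/s")
print(f"not waiting for ack: {without_ack:.1f} KiB/s")
```

With these assumptions the two results land close to the 26K/s and 51-53K/s figures quoted in the discussion, which suggests the block size and frame latency, not the 12 Mbps bus rate, dominate in this setup.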
11:41.35vmasterhttp://mmd.ath.cx/usb_latency.log
11:42.06vmasterthat's the log output from my openocd, when writing 128kb using the 100% safe method to an ARM7TDMI-S
11:42.44vmasterthe TCK frequency was limited to 2 MHz
11:45.53vmaster3 MHz would be possible, too, but doesn't improve the performance, and the arm7tdmi-s can't take more than 1/6th of its core frequency
11:46.00vmasterwhich might be 14mhz at startup
12:20.50vmasterthe actual writing starts at line 6222, after a reset, some initializations and querying the boards flash
12:21.26vmasterthe Info: ...
12:21.37vmasterftd2xx_execute_queue() lines
12:21.48vmasterdisplay the USB latency
12:22.07vmasterinter is right after handing the buffer to the FTDI lib
12:23.17vmasterinter2 could actually be removed, the code between inter and inter2 isn't enabled on current builds
12:23.41vmasterand end is right after FT_Read returned the requested data (including the ACK)
12:25.43Griffon26I'd say printing is incorrect
12:26.10Griffon26I think the fractional part is leaving out leading zeroes
12:26.42vmastermhh, it's seconds.microseconds
12:26.53Griffon26that's what I figured
12:28.34vmasteryeah, it isn't meant to be a float, and sec: %i usec: %i might be more appropriate, but I haven't enabled that output for months
12:30.23Griffon26ok, it looks like 2ms on average, which would make sense
12:31.50Griffon26now if I assume a JTAG overhead factor of 29, I come to 53K/s if you're not waiting for acks and it also approaches the theoretical maximum of USB1.1 when using 54 bytes/block
12:32.36Griffon26if I look at USB2.0, the optimum bytes per block would be around 270 and the speed with acks would be about 1Mbyte/s and without acks 2Mbyte/s
12:35.36vmasterwriting 56 bytes takes 1065 tck cycles, so the overhead is only 2.37
12:36.24Griffon26how do you figure? It must be at least 8 because of bytes->bits, right?
12:36.38vmasteroh well, 2.37 tck's per bit
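The figure above can be checked with a few lines. The sketch below uses only the numbers quoted in the log (1065 TCK cycles for 56 bytes, TCK limited to 2 MHz); the derived throughput is a TCK-only upper bound that ignores USB latency and any other bottleneck, and the variable names are illustrative.

```python
# TCK cost per payload bit, from the numbers quoted in the log.
TCK_CYCLES = 1065       # TCK cycles needed to write one block
PAYLOAD_BYTES = 56      # payload bytes per block
TCK_HZ = 2_000_000      # TCK frequency used in the test (2 MHz)

payload_bits = PAYLOAD_BYTES * 8
tck_per_bit = TCK_CYCLES / payload_bits

# Time on the wire for one block, and the resulting TCK-only bound.
block_time_s = TCK_CYCLES / TCK_HZ
tck_bound_bytes_per_s = PAYLOAD_BYTES / block_time_s

print(f"TCK cycles per payload bit: {tck_per_bit:.3f}")
print(f"time per block at 2 MHz:    {block_time_s * 1e6:.1f} us")
print(f"TCK-only upper bound:       {tck_bound_bytes_per_s / 1024:.1f} KiB/s")
```

Under these assumptions one block needs roughly half a millisecond of TCK time, so at 2 MHz the JTAG clock alone would not limit transfers to the 25-50 kb/s range measured earlier.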
12:38.18vmasteri calculated the number of tcks a while ago
12:39.22vmasterthe 50-60 kb/s when not waiting for an ack seems to be caused by the FT2232C itself, not the USB comms
12:39.30Griffon26ok, I'm lost. I thought nothing much was done by the hardware and that you wrote one byte to set the state of the pins for one TCK cycle
12:39.58vmasterdid you have a look at the FT2232C MPSSE appnote?
12:40.04Griffon26I don't think so
12:42.12vmasterhttp://www.ftdichip.com/Documents/AppNotes/AN2232C-01_MPSSE_Cmnd.pdf
12:42.27Griffon26ok
12:46.34vmasterthe command byte specifies on which clock edge data should be read/written, whether data is transmitted lsb or msb first, whether it's bits or bytes and which lines should be written
12:47.04vmasterthe next byte specifies the length
12:47.16vmasterfollowed by data bytes
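A sketch of how such a command could be assembled for a byte-oriented write. The bit assignments and the two-byte, length-minus-one field used below are assumptions based on a reading of the FTDI MPSSE application note and should be checked against it; the two-byte length is at least consistent with the 65536-byte maximum mentioned later in the log. The helper name is hypothetical and nothing here talks to real hardware.

```python
import struct

# Command-byte flag bits (assumed layout, verify against the FTDI appnote).
WRITE_NEG_EDGE = 0x01  # write data on the falling clock edge
BIT_MODE       = 0x02  # length counts bits instead of bytes
READ_NEG_EDGE  = 0x04  # sample data on the falling clock edge
LSB_FIRST      = 0x08  # shift least significant bit first
WRITE_TDI      = 0x10  # drive TDI with the data
READ_TDO       = 0x20  # capture TDO while shifting
WRITE_TMS      = 0x40  # drive TMS with the data


def mpsse_write_bytes(data, lsb_first=True, neg_edge=True):
    """Build a byte-mode 'clock data out on TDI' command (hypothetical helper)."""
    if not 1 <= len(data) <= 65536:
        raise ValueError("byte-mode transfers carry 1..65536 bytes")
    cmd = WRITE_TDI
    if lsb_first:
        cmd |= LSB_FIRST
    if neg_edge:
        cmd |= WRITE_NEG_EDGE
    # Assumed encoding: command byte, then length - 1 as little-endian 16 bit.
    return struct.pack("<BH", cmd, len(data) - 1) + bytes(data)


packet = mpsse_write_bytes(b"\xA5\x5A")
print(packet.hex(" "))
```

Many such commands can be concatenated into one buffer before it is handed to the driver, which is the "large queues of commands" approach described earlier in the conversation.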
12:56.28Griffon26oh, 6MHz max
12:58.52Griffon26it's nice, the MPSSE, but it's just too slow
13:19.42vmasteri guess you can only come close to the 6MHz if all you do is writes of a very large number of bytes (max is 65536)
13:20.59vmasterof course the mpsse isn't as fast as i'd like it to be, but it's a very convenient solution, and quite cheap, too
13:21.47vmasterwe've discussed alternatives here every now and then, and AchiestDragon is even working on a PCB, but this is going to be very complex
13:24.42vmasterat higher speeds, the optimistic strategy ("the core is going to be fast enough") is more likely to fail, so you'll have to wait for the ACKs
13:25.16vmasterand then latency kicks in, and even USB 2.0 hi-speed with its 125us microframes is going to set your limit
13:25.48vmasterso you'll have to move the ACK detection and waiting into the device
13:27.30vmasterwhich means either you define a generic jtag protocol (like mpsse, but more like a script language e.g. STAPL), or you make your probe target dependent
13:30.32Griffon26what if you use an FPGA?
13:31.05vmasterthat's what AchiestDragon is working on
13:31.10Griffon26oh, ok
13:31.11vmasteriirc an ARM9 + FPGA
13:31.21vmasterbut it's still very complex
13:31.36Griffon26what is doing the USB stuff? the fpga?
13:32.11vmastersorry, don't know
14:36.25*** join/#openjtag bullet (n=bullet@120.62.62.81.cust.bluewin.ch)
16:29.49*** join/#openjtag bullet (n=bullet@121-69.0-85.cust.bluewin.ch)
20:54.49*** join/#openjtag toi (n=pleemans@d5152D12D.access.telenet.be)
