04:27.00 | *** join/#openjtag ka6sox (n=ka6sox@nslu2-linux/ka6sox) |
06:59.45 | *** join/#openjtag _AchiestDragon (n=dave@whipy.demon.co.uk) |
07:20.51 | *** join/#openjtag AchiestDragon (n=dave@whipy.demon.co.uk) |
09:49.25 | Griffon26 | vmaster: you around? |
09:56.29 | Griffon26 | I'm trying to do some JTAG speed calculations to see if I understand the influence of latency correctly. |
09:56.55 | Griffon26 | If anyone has some numbers I could use to verify the calculations, please let me know. |
10:35.05 | *** join/#openjtag bullet (n=bullet@226-68.1-85.cust.bluewin.ch) |
10:35.42 | *** join/#openjtag vmaster_ (i=vmaster@p549B6733.dip.t-dialin.net) |
10:43.38 | vmaster_ | Griffon26: I'm here |
10:44.03 | Griffon26 | I'm trying to match my calculations with what you told me about your transfer speeds |
10:45.11 | Griffon26 | you said you could transfer 54 bytes in one go |
10:45.44 | Griffon26 | for those you'd have to wait for an ack, so that's 2 * latency |
10:46.17 | Griffon26 | this was USB1.x, so max 12Mbps and 1ms latency |
10:46.41 | Griffon26 | and you got a speed of several 100s K/s? |
10:48.03 | vmaster_ | there are two ways to transfer data on an ARM7/9 |
10:49.37 | vmaster_ | one allows me to transfer 54 bytes at a time, the other transfers only 4 bytes at a time, but i continue without waiting for an ack |
10:51.29 | vmaster_ | i'll just run some tests to give you definite numbers |
10:51.38 | Griffon26 | ok, great! thanks |
11:02.16 | vmaster_ | okay: 100% safe transfers: ~25kb/s |
11:02.52 | vmaster_ | same method, but without waiting for the target: ~50kb/s |
11:03.06 | vmaster_ | different method, with a small handler running on the core to accept the data: ~100kb/s |
11:03.41 | vmaster_ | with fewer processes running i get about 20% better results |
11:06.32 | Griffon26 | and your jtag adapter is manually toggling the clock pin, right? |
11:06.46 | Griffon26 | I mean, the clock is also coming from USB data? |
11:11.52 | Griffon26 | assuming a JTAG overhead of a factor of 30, I come to 26K/s with 1ms latency and 51K/s with 0ms latency (both USB1.x). But I may be oversimplifying. |
11:12.19 | Griffon26 | I'm not taking into account any bottlenecks in the other parts of the probe and a JTAG overhead of 30 seems excessive. |
11:19.17 | vmaster_ | i'm using the MPSSE |
11:19.29 | vmaster_ | so the clock is toggled for every bit shifted to TMS or TDI/TDO |
11:20.13 | Griffon26 | ah, that's good |
11:20.32 | vmaster | you'll also have to build large queues of commands |
11:21.27 | vmaster | because every scan you send takes at least one USB frame |
11:22.07 | Griffon26 | that means 54 bytes transferred per frame, right? |
11:23.43 | Griffon26 | oh wait.. I found an error in my calcs |
11:23.45 | vmaster | 54 bytes of payload |
11:24.01 | vmaster | but that's a lot more scans |
11:24.06 | Griffon26 | yes, I know |
11:25.13 | Griffon26 | I counted only 1 latency time |
11:25.21 | Griffon26 | should be 2 I think, per block |
11:27.21 | vmaster | mhh, you can't calculate the latency that easily, i guess |
11:27.34 | Griffon26 | it's max you mean |
11:27.54 | vmaster | yeah, it's just an upper bound |
11:28.42 | Griffon26 | well, it's safe to say that the latency of the reply is just added in full.. for the first one I should do some rounding up |
11:32.26 | Griffon26 | hmm.. you can always send 54 bytes within one frame, so latency is the only thing that influences speed |
11:32.45 | Griffon26 | that can't be right |
11:37.54 | Griffon26 | if you have 1ms frames, if 54 bytes fit in one frame (incl overhead) @ 12Mbps, if you wait for ack -> 54 bytes take 2 ms -> 26K/s. Huh? |
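Griffon26's estimate can be reproduced with a short sketch. It assumes, as in the discussion, one 54-byte block per 1 ms USB full-speed frame, one extra frame when waiting for the ACK, and K = 1024 bytes. The function name and parameters are made up for illustration, and real latency can be lower than this upper bound.

```python
def block_throughput(payload_bytes, frame_s, frames_per_block):
    """Bytes per second when each payload block occupies a whole number of frames."""
    return payload_bytes / (frame_s * frames_per_block)

FRAME_S = 1e-3  # assumed 1 ms frame time, used as the latency bound in the chat
PAYLOAD = 54    # payload bytes per block, from the chat

with_ack = block_throughput(PAYLOAD, FRAME_S, 2)     # send frame + ACK frame
without_ack = block_throughput(PAYLOAD, FRAME_S, 1)  # send frame only

print(f"with ack:    {with_ack / 1024:.1f} K/s")
print(f"without ack: {without_ack / 1024:.1f} K/s")
```

Under these assumptions the result is roughly 26 K/s with ACKs and 53 K/s without, which is in line with the ~25 and ~50 kb/s that vmaster measured.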
11:41.35 | vmaster | http://mmd.ath.cx/usb_latency.log |
11:42.06 | vmaster | that's the log output from my openocd, when writing 128kb using the 100% safe method to an ARM7TDMI-S |
11:42.44 | vmaster | the TCK frequency was limited to 2 MHz |
11:45.53 | vmaster | 3 MHz would be possible, too, but doesn't improve the performance, and the arm7tdmi-s can't take more than 1/6th of its core frequency |
11:46.00 | vmaster | which might be 14 MHz at startup |
12:20.50 | vmaster | the actual writing starts at line 6222, after a reset, some initializations and querying the board's flash |
12:21.26 | vmaster | the Info: ... |
12:21.37 | vmaster | ftd2xx_execute_queue() lines |
12:21.48 | vmaster | display the USB latency |
12:22.07 | vmaster | inter is right after handing the buffer to the FTDI lib |
12:23.17 | vmaster | inter2 could actually be removed, the code between inter and inter2 isn't enabled on current builds |
12:23.41 | vmaster | and end is right after FT_Read returned the requested data (including the ACK) |
12:25.43 | Griffon26 | I'd say printing is incorrect |
12:26.10 | Griffon26 | I think the fractional part is leaving out leading zeroes |
12:26.42 | vmaster | mhh, it's seconds.microseconds |
12:26.53 | Griffon26 | that's what I figured |
12:28.34 | vmaster | yeah, it isn't meant to be a float, and sec: %i usec: %i might be more appropriate, but I haven't enabled that output for months |
12:30.23 | Griffon26 | ok, it looks like 2ms on average, which would make sense |
12:31.50 | Griffon26 | now if I assume a JTAG overhead factor of 29, I come to 53K/s if you're not waiting for acks and it also approaches the theoretical maximum of USB1.1 when using 54 bytes/block |
12:32.36 | Griffon26 | if I look at USB2.0, the optimum bytes per block would be around 270 and the speed with acks would be about 1Mbyte/s and without acks 2Mbyte/s |
12:35.36 | vmaster | writing 56 bytes takes 1065 tck cycles, so the overhead is only 2.37 |
12:36.24 | Griffon26 | how do you figure? It must be at least 8 because of bytes->bits, right? |
12:36.38 | vmaster | oh well, 2.37 tck's per bit |
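The per-bit figure follows directly from the numbers vmaster quotes. The sketch below also estimates the rate if TCK were the only bottleneck, assuming the 2 MHz TCK limit mentioned earlier applies to this transfer; the function names are hypothetical.

```python
def tck_per_bit(tck_cycles, payload_bytes):
    """TCK cycles spent per payload bit."""
    return tck_cycles / (payload_bytes * 8)

def tck_limited_rate(tck_hz, tck_cycles, payload_bytes):
    """Payload bytes per second if TCK were the only bottleneck (assumption)."""
    return payload_bytes * tck_hz / tck_cycles

overhead = tck_per_bit(1065, 56)            # figures from the chat
rate = tck_limited_rate(2_000_000, 1065, 56)  # 2 MHz TCK is an assumption here

print(f"{overhead:.3f} TCK cycles per bit")
print(f"{rate:.0f} bytes/s at 2 MHz TCK")
```

The TCK-limited rate comes out at around 105,000 bytes/s, well above the ~25 kb/s measured with the safe method, which is consistent with latency rather than TCK being the limit there.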
12:38.18 | vmaster | i calculated the number of tcks a while ago |
12:39.22 | vmaster | the 50-60 kb/s when not waiting for an ack seems to be caused by the FT2232C itself, not the USB comms |
12:39.30 | Griffon26 | ok, I'm lost. I thought nothing much was done by the hardware and that you wrote one byte to set the state of the pins for one TCK cycle |
12:39.58 | vmaster | did you have a look at the FT2232C MPSSE appnote? |
12:40.04 | Griffon26 | I don't think so |
12:42.12 | vmaster | http://www.ftdichip.com/Documents/AppNotes/AN2232C-01_MPSSE_Cmnd.pdf |
12:42.27 | Griffon26 | ok |
12:46.34 | vmaster | the command byte specifies on which edge data should be read/written, if data is transmitted lsb/msb first, whether it's bits or bytes and what lines should be written |
12:47.04 | vmaster | the next byte specifies the length |
12:47.16 | vmaster | followed by data bytes |
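The command layout described above can be sketched as follows. The bit values are as I recall them from the FTDI appnote and libftdi's ftdi.h, and should be checked against the appnote before use. As far as I recall, in byte mode the length field is two bytes (low byte first) holding the count minus one; the helper name is hypothetical.

```python
# Opcode bits as recalled from the MPSSE appnote (assumption: verify before use)
WRITE_NEG = 0x01  # write data on the falling clock edge
BIT_MODE = 0x02   # length counts bits instead of bytes
READ_NEG = 0x04   # read data on the falling clock edge
LSB_FIRST = 0x08  # shift least significant bit first
DO_WRITE = 0x10   # drive TDI
DO_READ = 0x20    # sample TDO
WRITE_TMS = 0x40  # drive TMS instead of TDI

def mpsse_write_bytes(data: bytes) -> bytes:
    """Build a byte-mode 'clock data out' command: opcode, length, data."""
    if not 1 <= len(data) <= 65536:
        raise ValueError("byte-mode transfers carry 1..65536 bytes")
    opcode = DO_WRITE | LSB_FIRST | WRITE_NEG
    n = len(data) - 1  # length field stores count - 1
    return bytes([opcode, n & 0xFF, n >> 8]) + data

print(mpsse_write_bytes(b"\xaa\xbb").hex())
```

With three header bytes per command, short scans pay a proportionally large overhead, which fits the remark that the full clock rate is only approached with very long writes.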
12:56.28 | Griffon26 | oh, 6MHz max |
12:58.52 | Griffon26 | it's nice, the MPSSE, but it's just too slow |
13:19.42 | vmaster | i guess you can only come close to the 6MHz if all you do is writes of a very large number of bytes (max is 65536) |
13:20.59 | vmaster | of course the mpsse isn't as fast as i'd like it to be, but it's a very convenient solution, and quite cheap, too |
13:21.47 | vmaster | we've discussed alternatives here every now and then, and AchiestDragon is even working on a PCB, but this is going to be very complex |
13:24.42 | vmaster | at higher speeds, the optimistic strategy ("the core is going to be fast enough") is more likely to fail, so you'll have to wait for the ACKs |
13:25.16 | vmaster | and then latency kicks in, and even USB 2.0 hi-speed with its 125us is going to set your limit |
13:25.48 | vmaster | so you'll have to move the ACK detection and waiting into the device |
13:27.30 | vmaster | which means either you define a generic jtag protocol (like mpsse, but more like a script language e.g. STAPL), or you make your probe target dependent |
13:30.32 | Griffon26 | what if you use an FPGA? |
13:31.05 | vmaster | that's what AchiestDragon is working on |
13:31.10 | Griffon26 | oh, ok |
13:31.11 | vmaster | iirc an ARM9 + FPGA |
13:31.21 | vmaster | but it's still very complex |
13:31.36 | Griffon26 | what is doing the USB stuff? the fpga? |
13:32.11 | vmaster | sorry, don't know |
14:36.25 | *** join/#openjtag bullet (n=bullet@120.62.62.81.cust.bluewin.ch) |
16:29.49 | *** join/#openjtag bullet (n=bullet@121-69.0-85.cust.bluewin.ch) |
20:54.49 | *** join/#openjtag toi (n=pleemans@d5152D12D.access.telenet.be) |