CN1076838C - Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform Circuit - Google Patents

Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform Circuit Download PDF

Info

Publication number
CN1076838C
CN1076838C CN94109115A CN94109115A CN1076838C CN 1076838 C CN1076838 C CN 1076838C CN 94109115 A CN94109115 A CN 94109115A CN 94109115 A CN94109115 A CN 94109115A CN 1076838 C CN1076838 C CN 1076838C
Authority
CN
China
Prior art keywords
matrix
memory
mentioned
elements
counter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN94109115A
Other languages
Chinese (zh)
Other versions
CN1122026A (en
Inventor
张翊峰
高进南
黄柏川
Original Assignee
Industrial Technology Research Institute ITRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial Technology Research Institute ITRI filed Critical Industrial Technology Research Institute ITRI
Priority to CN94109115A priority Critical patent/CN1076838C/en
Publication of CN1122026A publication Critical patent/CN1122026A/en
Application granted granted Critical
Publication of CN1076838C publication Critical patent/CN1076838C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Links

Images

Landscapes

  • Complex Calculations (AREA)

Abstract

A transpose memory includes four dual-port memories, a first counter for writing elements into the dual-port memories, and a second counter for reading elements from the dual-port memories. If the received matrix is to be output to the first type switching circuit, the first counter writes each received matrix element into a specific memory corresponding to a quarter quadrant of the matrix. If the received matrix is to be output to the second type conversion circuit, the first counter writes the received matrix elements into a specific dual port memory corresponding to a column and a row of the matrix in either the singular or the even number.

Description

The transpose memory of discrete cosine transform/anti-discrete cosine change-over circuit
The present invention relates to one and assign into two transpose memories between the one dimension change-over circuit.It can be used on the circuit of positive discrete cosine transform (Discrete CosineTransform-DCT) or anti-discrete cosine conversion (Inverse DCT).Thus, it is compressed with usefulness to signal of video signal especially.
Positive discrete cosine transform is widely used in signal of video signal compression and processing.The positive discrete cosine transform of two dimension is defined as follows: Z mn = 4 N 2 C ( m ) c ( n ) Σ i = 0 N - 1 Σ j = 0 N - 1 X ij cos ( 2 i + 1 ) mII 2 N cos ( 2 j + 1 ) nII 2 N - - - - - ( 1 ) X wherein IjBe the input element i of a N * N matrix, j=0,1,2 ..., N-1, and z MnBe the output element of a N * N matrix, m, n=0,1,2 ..., n-1.Wherein Anti-discrete cosine then is defined as: X xj = Σ m = 0 N - 1 Σ n = 0 N - 1 c ( m ) c ( n ) Z mn cos ( 2 i + 1 ) mII 2 N cos ( 2 j + 1 ) nII 2 N - - - - - ( 2 ) The matrix form of equation (1) is
Z=CXC t(3) wherein X is the input data matrix, and C is the cosine transition matrix, and C tIt is the transposition of Matrix C.If X is a N * N matrix, then equation (3) can be expressed as: Z 11 Z 12 . . . Z 1 N Z 21 Z 22 . . . Z 2 N . . Z N 1 Z N 2 . . . Z NN C 11 C 12 . . . C 3 N C 21 C 22 . . . C NN . . C N 1 C N 2 . . . C NN X 11 X 12 . . . X 1 N X 21 X 22 . . . X 2 N . . X N 1 X N 2 . . . X NN C 11 C 21 . . . C N 1 C 11 C 22 . . . C N 2 . . C 1 N C 2 N . . . C NN - - - - ( 4 ) The very similar positive discrete cosine transform of computing of anti-Discrete Slant string conversion is except C and C tExchange.Therefore can the Dan Yizheng discrete cosine transform discuss.Equation (3) can be divided into following one dimension conversion
Y=CX (5)
Z=YC t(6) wherein Y is a product matrix placed in the middle, and C tIt is the transposition of Matrix C.Lead from equation (6), following equation is similarly effective:
Z=CY t(6a) Y wherein tTransposition for matrix Y.If the anti-discrete cosine conversion, we then have:
X=C tY or X=Y tThe many tradition of C (6b) are used the positive discrete cosine transform technique or the circuit of input data matrix, as " discrete cosine transform " of N.Ahmed, and IEEE Trans.on comp., vol.C-23, Jan., 1974, the positive discrete cosine transform technique of property as 90-93 proposes; N.Cho ﹠amp; S.Lee, " fast algorithm that 2-D discrete cosine changes and enforcement (Fast Algorithm and Implementation of 2-D DCT "), IEEETrans.on Cir.﹠amp; Sys., Vol.38, No.3, Mar.1991,297-305 propose N positive discrete cosine transform circuit of one dimension of utilization or multiplexer, calculate a positive discrete cosine transform of two-dimentional N * N with N/2 the positive discrete cosine transform circuit of one dimension; M.Sun et al., " VLSI Implementation of a 16 * 16 DCT ", IEEETrans.on Cir.﹠amp; Sys., Vol.36, No.4, Apr.1989,610-17 propose the positive discrete cosine transform of two dimension that a use storer is stored the positive discrete cosine transform matrix operation of one dimension partial results; H.Hou, " calculate the quick recursive algorithm (A Fast Recursive Algorithm for Computing theDCT) of discrete cosine transform ", IEEE Trans.on Acoustic, Speech and SignalProcessing, Vol.ASSP-35, No.10,1987,1455-61 utilizes the symmetry of just discrete string conversion coefficient matrix to reduce the quantity of multiplying.Equation (5) can be write as: Y o Y e = C 1 C 1 C 2 - C 2 X f X r - - - ( 7 ) Y wherein eBe the matrix that contains matrix Y even number line, Y oBe the matrix that contains matrix Y odd-numbered line, X fBe the matrix that contains the matrix X first half capable (upper half rows), X rBe the matrix that contains matrix X Lower Half row, C 1And C 2Be
Figure C9410911500081
Positive discrete cosine transform matrix of coefficients.For example, with one 8 * 8 positive discrete cosine transform, equation (7) can be written as: Y 1 Y 3 Y 5 Y 7 D E F G E - G - D - F F - D G E G - F E - D X 0 - X 7 X 1 - X 6 X 2 - X 5 X 3 - X 4 - - - Y 0 Y 2 Y 4 Y 6 = A A A A B C - C - B A - A - A A C - B B - C X 0 + X 7 X 1 + X 6 X 2 + X 5 X 3 + X 4 - - - ( 7 a ) A=cos (π/4) wherein, B=cos (π/8), C=sin (π/8), D=cos (π/16), E=cos (3 π/16), F=sin (3 π/16), G=(sin π/16).Y 0, Y 1, Y 2, Y 7Be the row of matrix Y, and X 0, X 1..., X 7Row for input data matrix X.Thus, the anti-discrete cosine conversion can be written as: X f X r = C 1 t C 1 t C 1 t - C 2 t Y o Y e - - - ( 8 ) Wherein
Figure C9410911500084
Be C 1Conversion and
Figure C9410911500085
Be C 2It is conversion.For example, with one 8 * 8 positive discrete cosine transform, equation (8) can be written as: X 0 X 1 X 2 X 3 = A A A A B C - C - B A - A - A A C - B B - C Y 0 Y 2 Y 4 Y 6 + D E F G E - G - D - F F - D G E G - F E - D Y 1 Y 3 Y 5 Y 7 X 7 X 6 X 5 X 4 A A A A B C - C - B A - A - A A C - B B - C Y 0 Y 2 Y 4 Y 6 + D E F G E G - D - F F - D G E G - F E - E Y 1 Y 3 Y 5 Y 7 - - - ( 8 a ) Equation 7-7 (a) and 8-8 (a) also can be applicable to the Z=CY of decision as equation (6a) t, or at anti-discrete cosine conversion, Y=ZC.Equation (7a) and advantage (8a) are that the multiplying amount that reduces the positive discrete cosine transform of each one dimension reaches 50%.
Fig. 1 is the block diagram of just/anti-discrete string conversion of two dimension of expression.Special 4,791,598 as the U.S., wherein the positive discrete cosine transform of first one dimension 3 is via n 1Line receives an input data matrix X.The positive discrete cosine transform 3 of this one dimension is according to mode (5) or (7a) conversion input data matrix Y, and via n 2Line exports transpose memory 5 to.Transpose memory 5 is with matrix Y transposition placed in the middle and via n 3The transposition Y of line Y tExport second positive discrete cosine transform 7 of one dimension to.The last positive discrete cosine transform 7 of this one dimension is according to equation (6), (6a) or (7a) via n 4A two dimension output of line output transition matrix Z.
With a positive discrete cosine transform circuit is example, one of the positive discrete cosine transform circuit output of one dimension " OK-and the hurdle " (row-column) in proper order, that is to say Y 0, Y 1, Y 2..., Y 63(it seems that with this one 8 * 8 entry of a matrix element is Y Ij, i, j=0,1,2 ..., 7).Transpose memory 5 writes data in view of the above and with transposition order (hurdle-OK) it is read.That is to say Y 0, Y 8, Y 24, Y 32..., Y 1, Y 9, Y 17, Y 25, Y 33..., Y 63But, the just discrete string change-over circuit 7 of equation (7a) demonstration one dimension need once be read two elements from matrix Y: (Y 0, Y 56), (Y 8, Y 48), (Y 16, Y 48) ..., (Y 1, Y 57), (Y 9, Y 49) ..., (Y 31, Y 39), or be referred to as " towing hurdle-OK " (shuffled column-row) in proper order.Data from transposition storer output is then received by pre-treatment (pre-processing) circuit that contains ALU (ALU) 76 and translation input/parallel output register 7C, and to pull two elements in hurdle-once export to the row sequential parallel.
With an anti-discrete cosine circuit is example, one dimension change-over circuit 3 with matrix battle array Y with Y 0, Y 7, Y 1, Y 6, Y 2, Y 5..., Y 8, Y 15, Y 9, Y 14..., Y 59, Y 60(towing row-hurdle order) exports transpose memory to, and exports (Y with the mixing hurdle-row order of transposition by transpose memory 0, Y 56, Y 8, Y 48, Y 16, Y 40..., Y 1, Y 57, Y 9, Y 49..., Y 31, Y 39).But, equation (8a) demonstration one dimension anti-discrete cosine change-over circuit need once be read two elements from matrix Y: (Y 0, Y 8), (Y 16, Y 24) ..., (Y 1, Y 9), (Y 17, Y 25) ..., (Y 55, Y 63).Again, the translation of pre processing circuit input/parallel output register receives towing hurdle-row order output data from the storer that transplants, and once exports two elements with same order.
With the architecture design of Fig. 1 is that complete run-in index has its necessity, that is to say, matrix Y placed in the middle is from the positive discrete cosine transform 3 of first one dimension, through transpose memory 5, to the circulation between second positive discrete cosine transform 7 of one dimension without any interruption.But, the normally the most difficult part of transpose memory 5.For example, if data is to write transpose memory 5 in proper order with row-hurdle, then be to call over hurdle-row.If the required time of read-write respectively is T, that petty traditional transpose memory then needs the 2T time to handle a matrix Y.
In order to reduce the processing time, different traditional transpose memories are suggested in succession.Framework has the earliest used two transpose memories and each matrix Y placed in the middle of interaction process.In first period, first transpose memory writes first matrix Y1 placed in the middle according to row-hurdle order.Second period, in second period, second transpose memory writes second matrix Y2 placed in the middle in proper order according to row-hurdle, and simultaneously, Y1 reads from first transpose memory with hurdle-row order.By that analogy, the 3rd period, first transpose memory writes Y3 in proper order with row-hurdle, and calls over Y2 from second transpose memory with hurdle-row simultaneously.The shortcoming of this transpose memory is to need two storeies.
Fig. 2 has represented by United States Patent (USP) 4,791, the 598 another kind of traditional approachs that propose.As shown in the figure, transpose memory 5 comprises a row the parallel register row 15 with a row of translation register row 13.Matrix Y translation input translation register row 13 (with hurdle-row in proper order) make each register 13-1,13-2 ..., the corresponding row of 13-N storage matrix Y.When whole matrix all deposits translation register in and arranges 13, each register 13-1,13-2 ..., 13-N just passes to their data corresponding parallel register 15-1 abreast, 15-2 ..., 15-N.Thus, parallel register row's 15 data just can be imported ALU 17.Simultaneously, translation register row 13 can receive next matrix Y at once.The shortcoming of this framework is the space of the many registers of waste.United States Patent (USP) 5,053,985 propose another kind of framework, as shown in Figure 3.The positive discrete cosine transform 20 of one dimension is handled input matrix X with the speed that doubles the data input rate.Comprise that with a positive discrete cosine transform 20 of one dimension preceding register and ALU 30 come the pre-treatment data, come the processing array multiplication, also have late register and ALU 40 with multiplier and totalizer 35.The matrix Y placed in the middle that this one-dimensional discrete cosine conversion 20 produces is stored in a RAM60, if say with row-hurdle order, is to call over and handled by the positive discrete cosine transform of one dimension with hurdle-row from RAM 60 then.Because the processing speed of its conversion is the twice of matrix X input, therefore with " the 2-D discrete cosine switching center processor of a kind of 100MHz (the A 100MHz 2-D DCT CoreProcessor) " of S.Urampto el al., IEEE J.of Solid State Cirs., Vol.27, No.4, P.429-99, Apr.1992 is an example, the data input rate be limited in processor speed half.
United States Patent (USP) 5,042,007 and 5,177,704 propose a transpose memory device that connects to FIFO (First-in first-out) with translation register.Two patents are all with (N 2+ 1) n bit translation register and (2N 2-2) 2 * 1 multiplexers are combined as transpose memory.Similarly, United States Patent (USP) 4,903,231 propose another with N 2Individual translation register and about N 2The transpose memory that individual multiplexer makes up.This framework can be exported matrix element in proper order or import with row one hurdle or hurdle delegation.But the shortcoming of these frameworks is the areas that accounted for many registers and multiplexer.
United States Patent (USP) 4,769,790 propose another kind of framework, are hurdle (or row) the input delay circuits (delay circuit) with an input matrix.Postpone a hurdle input sorting circuit (distribution circuit), and be rotated.The hurdle that is rotated exports second delay circuit to again with it delay.This framework has accounted for very big space at the used translation register of each delay circuit.
United States Patent (USP) 4,918,527 transpose memories that propose are divided into two halves.When data is written into a wherein half, second half then reads data.And the data that odd number is capable is to deposit the two halves storer in two pairs two over-over mode, that is to say, and is capable at each odd number, the element switch type on the hurdle that each is right deposit storer in, till whole matrix placed in the middle all deposits in.This framework has used a large amount of decoding circuits, has therefore accounted for very big space.
The major defect of aforementioned several frameworks is that they have accounted for sizable area.Two dimension just or the anti-discrete cosine converting structure more may cause the register overload of front processor.In addition, aforementioned various frameworks are neither can very flexiblely be read, and writes data, in other words, be expert at exactly-hurdle, and the hurdle-OK, conversion between towing hurdle-row and the towing row-hurdle.At last, traditional transpose memory need just/document before removing fully before the data that reads or writes between anti-discrete cosine conversion.Target of the present invention promptly is the shortcoming that is to overcome foregoing structure.
A transpose memory is accepted a matrix according to first order, according to second order matrix element of receiving is just being exported to or the anti-discrete cosine change-over circuit then.A transpose memory comprises 4 dual-port random access memories (dual-portRAM).The element of 1/4th quadrants of each dual-ported memory storage matrix.A transpose memory comprises that simultaneously a register is used for the matrix element of each reception is write a certain specific storer.In addition, comprise that also a register calls over element to (element pair) according to second.
First register writes a matrix element the specific dual-ported memory of selection to reach the purpose of reading a pair of element simultaneously.And must not increase extra register.If element is to output to positive discrete cosine transform circuit, each dual-ported memory is assigned with 1/4 matrix element.Thus, first register writes a specific dual-ported memory that is assigned to each matrix element.That is to say that the matrix element in the upper left corner writes first dual-ported memory; The upper right corner write second; The lower left corner write the 3rd; The lower right corner write the 4th.If element is to output to the anti-discrete cosine change-over circuit, each dual-ported memory then is assigned to matrix element to go the hurdle interactively.Thus, first register writes specific dual-ported memory respectively with the matrix element on " list " or " two " row and hurdle.That is to say that the matrix element that is positioned at single file and single hurdle writes first storer; Be positioned at second storer that write on the two hurdles of single file; Be positioned at the 3rd storer that write on duplicate rows list hurdle; Be positioned at the 4th storer that write on the two hurdles of duplicate rows.
Therefore, first kind of wiring method allows element to reading with towing order (Shuffledorder).For instance, element Y 0, Y 8, Y 16, Y 24(being positioned at the upper left corner of matrix) writes on first storer; And be positioned at the element Y in the matrix lower left corner 32, Y 40, Y 48, Y 56Be stored in second storer.Therefore, second register can be at an easy rate to pull the capable element that calls in hurdle to (Y 0, Y 56), (Y 8, Y 48), (Y 16, Y 40), (Y 24, Y 32).Similarly, second kind of wiring method allows element to reading with non-towing order (Non-Shuffled order).For instance, be positioned at the element Y of matrix single file 0, Y 16, Y 32, Y 48Write on first storer; And be positioned at the element Y of matrix duplicate rows 8, Y 24, Y 40, Y 56Be stored in second storer.Therefore, second counter can be at an easy rate to pull the capable element that calls in hurdle to (Y 0, Y 8), (Y 16, Y 24), (Y 32, Y 40), (Y 48, Y 56).
In fact, the transpose memory that the present invention proposes can be assigned between two one dimension change-over circuits, for example between two just discrete (or anti-discrete) cosine change-over circuits.Thus, first and second registers can reach the computing (Pipeline Transpose operation) of pipeline transposition with the synchronization of the matrix element of input.That is to say that first register is when second register read the matrix element that last time writes, with the matrix element write store that receives.And the memory storage element that writes the element place was vacateed earlier before next element writes.So mutual read-write, this transpose memory promptly can a kind of conversion pattern to write element right, again to read with one or another kind of pattern.
Briefly, this transpose memory can be with four storeies than small size, and two registers and a little logical circuit combine.And under various input sequence, export data simply regularly.
Fig. 1 be a conventional two-dimensional just/block diagram of anti-discrete cosine change-over circuit;
Fig. 2 is the block diagram of a traditional transpose memory;
Fig. 3 is the block diagram of another kind of traditional transpose memory;
Fig. 4 be one according to the present invention Wei Ji Foundation two dimension just/block diagram of anti-discrete cosine change-over circuit;
Fig. 5 is a block diagram according to detailed transpose memory of the present invention;
Fig. 6 (a) (b), (c), is according to the address register order that specifically writes of the present invention (d);
The transpose memory state of Fig. 7 example key diagram 5: according to Fig. 6 (a) write address register order and with the state after partly the writing of first matrix;
Fig. 8 (a) (b), (c), is according to the address register order of specifically reading of the present invention (d);
The transpose memory state of Fig. 9 example key diagram 5: according to Fig. 6 (d) write address register order, the part of second matrix is write, and according to the state after the address register calls over first matrix of reading of Fig. 8 (a);
Figure 10 example illustrates the transpose memory state of another Fig. 5: according to Fig. 6 (a) write address register order, with partly writing of the 3rd matrix, and call over second state behind the matrix according to the address register that reads of Fig. 8 (d);
Figure 11 example illustrates the transpose memory state of a kind of Fig. 5; According to Fig. 6 (b) write address register order and with the state after partly the writing of first matrix;
The transpose memory state of Figure 12 example key diagram 5: according to Fig. 6 (c) write address register order, with partly writing of second matrix, and according to the form after the address register calls over first matrix of reading of Fig. 8 (b);
The transpose memory state of Figure 13 example key diagram 5; According to Fig. 6 (b) write address register order, with partly writing of the 3rd matrix, and call over second state after the square limit according to the address register that reads of Fig. 8 (c).
Fig. 4 be a two dimension just/anti-discrete cosine change-over circuit framework 100.As shown in the figure, according to two dimension of the present invention just/anti-discrete cosine change-over circuit framework 100 comprising an one dimension just or anti-discrete cosine change-over circuit 110, wherein exporting matrix Y placed in the middle is Y 0, Y 2..., Y 63(positive discrete cosine transform) or Y 0, Y 7, Y 1, Y 6, Y 2, Y 5..., Y 60(anti-discrete cosine conversion) be order.According to the present invention, the matrix Y of output passes to a transpose memory 120 via the bus-bar 111 of n bit.This transpose memory 120 output transposed matrix Y tBe by the extremely positive discrete cosine transform circuit 140 of the capable order in towing hurdle (or capable order in hurdle is to anti-discrete cosine change-over circuit 140), and once export two elements.Matrix Y tThen reach and only contain an ALU (Aritnmetic Logic Unit---ALU) 135 front processor 130.Since the present invention's transpose memory 120 is output matrix Y correctly t, so the required translation/parallel register 13,15 (as shown in Figure 2) of prior art can omit.
Fig. 5 is the detail drawing of transpose memory of the present invention.As shown in the figure, one can transposition N * N matrix transpose memory comprise four dual-port random access memories 141,142,143,144.Each dual-port random access memory 141-144 is
Figure C9410911500161
Be used for storing a matrix Y placed in the middle element 1/4th.Each dual-port random access memory 141-144 comprises a corresponding data input port one 41-1,142-1,143-1,144-1, be used for respectively receiving via n bit bus-bar 111, from the one dimension of Fig. 4 just/element of the matrix Y placed in the middle of anti-discrete cosine change-over circuit 110 outputs.Each dual-port random access memory 141-144 also comprises a corresponding address and writes port one 41-2,142-2, and 143-2,144-2 is connected to one respectively and writes address register 151.In addition, each dual-port random access memory comprises permission end (enable terminal) 141-3 again, 142-3, and 143-3,144-3, respectively via corresponding line 151-1,151-2,151-3,151-4 are connected to and write address register 151.When writing address register 151 at 151-1,151-2,151-3, lines such as 151-4 produce enabling signal, and corresponding dual-port random access memory is then licensed.Licensed dual-port random access memory is then according to the address indication that writes address register 151.The matrix element writing memory element placed in the middle of data input line 111 will be appeared at.
Except writing address register 151, this transpose memory 120 ports contain one and read address register 152, and port one 41-4,142-4,143-4,144-4 are read in the address that is connected respectively to each dual-port random access memory 141-144.In addition, read two of address register 152 outputs and select control signal S1 and S2.When receiving the address of reading address register 152, each dual-port random access memory 141-142 reads the memory element value that this address refers to simultaneously.Dual-port random access memory 141-144 is via read port 141-5,142-5, and 143-5 and 144-5 read element.
Select control signal S1 to be connected to first multiplexer 161.As shown in the figure, multiplexer 161 receives the output element from the read port 142-5 and the 143-5 of dual-port random access memory 142 and 143.Select control signal S2 to be connected to second and third multiplexer 162 and 163.A data input of multiplexer 162 and formula multiplexer 163 is passed in the output of multiplexer 161 respectively, dual-port random access memory 141 then output data are to another data input of multiplexer 162, and dual-port random access memory 144 output data are to another data input of multiplexer 163.
As implied above, the parallel port one 41-5 output matrix element that reads from the address of each dual-port random access memory 141-144.But through selecting control signal S1 and S2, dual-port random access memory 141,142,143,144 wherein two element can the time export.
It below is the calculation specification of this transpose memory 120.For convenience's sake, suppose that each matrix Y placed in the middle is 8 * 8, and element is numbered Y 0, Y 1..., Y 8, Y 9..., Y 63Trip hurdle order.Writing address register 151 can come the memory element of dual-port random access memory 141-144 is counted according to a string counting sequence that writes.For instance, Fig. 6 (a), (b), (c), (d) expression four kinds of write address counter 151 is different writes counting sequence.As shown in the figure, each dual-port random access memory 141-144 comprises the memory element group 171-0 to 171-15 that is used for storing 16 matrix elements respectively, 172-0 to 172-15,173-0 to 173-15,174-0 to 174-15, numbering 1 to 64 has represented that write address counter 151 writes order of each element.For example, according to Fig. 6 (a), in period 1 (cycle), write address counter 151 points to the memory element 171-0 of dual-port random access memory 141.Second round, then point to the storage file 171-1 of dual-port random access memory 141, by that analogy.
Hence one can see that, and the counting sequence (Fig. 6 (a)~(d)) that write address counter 151 is selected is according to the input of matrix Y and the sequence order (sequence order) of output element decide between two parties.Its input and output sequence see table one in proper order for details.Wherein output sequence represents that with parantheses the element of each cycle output is right.
The example of the just discrete string conversion of a two dimension below will be discussed.As shown in Table 1, the positive discrete cosine transform circuit 110 of the one dimension of Fig. 4 with each matrix Y placed in the middle to go the output of hurdle sequence.Data input bus-bar 111 is with the speed input transpose memory 120 of one-period one element in the element of Y.Transpose memory 120 exports the element of the Y of input to front processor 135 again to drag the capable order of electric fence, and each towing hurdle all heavily covers (as table 1) one time.Suppose a string matrix Y1 of positive discrete cosine transform circuit 110 sequences of one dimension ground output, Y2, Y3, Y4 ..., write address counter 151 is according to the counting sequence counting of Fig. 6 (a).For example, at first with the element of importing
Figure C9410911500181
Write the memory element 171-0 of dual-port random access memory 141, second element
Figure C9410911500182
Writing memory element 171-1, by that analogy.The element of the matrix Y in preceding 57 cycles write and the appointment of memory element sees table 2 for details.Fig. 7 represents the state of dual-port random access memory 141-144 after the 57th cycle.Wherein dash area is represented empty memory element.
Similar with write address counter 151, read the counting sequence counting of address counter 152 according to Fig. 8 (a)-(d).Shown in Fig. 8 (a)-(d), read address counter 152 and pointing to two memory elements that are positioned at different dual-port random access memories with one-period.For example, according to Fig. 8 (a),, read address counter 152 and read element in memory element 170-0 and the 173-0 from dual-port random access memory 141 and 143 respectively in the period 1.In addition, the memory element of each Output bar correspondence of matrix all must heavily cover and read once.Fig. 8 (a) for example reads address counter 152 and points to a hurdle of dual-port random access memory 141 and 143 in first to fourth cycle, memory element 171-0 just, 173-0,171-4,173-4,171-8,173-8,171-12,173-12.This hurdle read once again in the 5th to eight cycle again.Thus, reading address counter 152, to read element respectively in the cycle one to four right ( Y 0 1 , Y 56 1 ) , ( Y 8 1 , Y 48 1 ) , ( Y 16 1 , Y 40 1 ) , ( Y 24 1 , Y 32 1 ) , Read once respectively again to eight in cycle five.
To write and read address counter 151 and 152 and can write and read matrix element simultaneously in order to make, both must the phase mutually synchronization.The dual-port random access memory state of supposing the 57th all after dates is that shown in Figure 7, and from the 58th cycle, reading address counter 152, can to begin the sensor matrix element according to the order shown in Fig. 8 (a) right.So the 58th cycle, write address counter 151 is with matrix element Be stored in the memory element 173-1.In addition, reading address counter 152, also to read element from memory element 171-0 and 173-0 right
Figure C9410911500193
Read the memory element 171-0 that address counter 152 need only export an address to dual-port random access memory 141-144,172-0,173-0, it is right that 174-0 can reach the element that reads.This is that address counter 152 has produced suitable S1 simultaneously and S2 selects control signal because read, together to control multiplexer 161,162,163 and to select correct element to output.With above-mentioned is example, and S1 and S2 promptly control the output that multiplexer 161 is selected dual-port random access memory 143, and control multiplexer 162 and 163 is selected the output of dual-port random access memory 141 and multiplexer 161 respectively.
Thus, in the 59th cycle, read address counter 151 with element
Figure C9410911500201
Writing memory element 173-2, and reading address counter 152, to read element from memory element 171-4 and 173-4 right This process continues up to the 64th cycle, sees table 3 for details.After the 64th cycle, write address counter 151 is with the element that shows most of matrix Y1 Write, and read address counter 152 and also almost the element in first hurdle of matrix Y1 is run through.At this moment, write address counter 151 is with the element of matrix Y2 ( Y 0 2 , Y 1 2 , · · · · · · , Y 8 2 , Y 9 2 , · · · · · · , Y 63 2 ) Write dual-port random access memory 141-144.But, write address counter 151 reads in the memory element that address counter 152 no longer reads so that element is write only.For example, write address counter 151 can write the present memory element 171-0 that stores the matrix Y1 first hurdle element, 173-0,171-4,173-4,171-8,173-8,171-12,173-12 with the element of matrix Y2 first row.For the element of suitable storage matrix Y2, those read the memory element that address counter 152 no longer needs to write address counter 151 according to the sequential counting of Fig. 6 (d).Therefore, the 65th cycle, when reading address counter 152 from memory element 171-12, it is right that 173-12 reads element
Figure C9410911500206
Simultaneously, write address counter 151 is with element
Figure C9410911500207
Writing memory element 171-0.Matrix Y1 read and writing of matrix Y2 continues up to for the 121st cycle, see table 3 for details.After the 121st cycle, the state of dual-port random access memory 141-144 sees Fig. 9 for details.Wherein dash area represents to read the memory element part that address counter 152 " no longer needs ".
From the 122nd beginning in cycle, read address counter 152 and begin the element of matrix Y2 is read.But because matrix Y2 write the write sequence (Fig. 6 (a)) that counting sequence (Fig. 6 (d)) is different from matrix Y1, must count with different counting sequences (Fig. 8 (d)) so read address counter 152.For example, the 122nd cycle, when reading address counter 152 from memory element 171-0, it is right that 172-0 reads element Simultaneously, write address counter 151 is with element
Figure C9410911500212
Writing memory element 172-4.This simultaneously respectively according to Fig. 8 (d) and Fig. 6 (d) order read and write continue up to the 128th cycle, see table 4 for details.
The 129th cycle, write address counter 151 begin with the element sequence of matrix Y3 write those and read the memory element that address counter 152 no longer needs.At this moment, write address counter 151 is to count according to the counting sequence of Fig. 6 (a).Therefore, the 129th cycle, when reading address counter 152 from memory element 171-3, it is right that 172-3 reads element
Figure C9410911500213
Simultaneously, write address counter 151 is with element
Figure C9410911500214
Writing memory element 171-0.This while writes according to the matrix Y3's of Fig. 6 (a) order, and according to the reading of the matrix Y2 of Fig. 8 (d) order, continues up to the 185th cycle, sees table 4 for details.After the 185th cycle, the state of dual-port random access memory 141-144 sees Figure 10 for details.Wherein dash area represents to read the memory element part that address counter 152 " no longer needs ".
As shown in figure 10, the element that matrix Y1 is stored among the stored element of matrix Y3 and Fig. 7 is identical.Therefore, writing with reading of previous matrix of a matrix can continue ad infinitum simultaneously.That is to say, as long as odd number matrix Y1, Y3 ... counting sequence with Fig. 6 (a) writes, the old Y2 of even number square, Y4, counting sequence with Fig. 6 (d) writes, and alternately with Fig. 8 (a), counting sequence (d) is read, that petty transpose memory 120 can have no discontinuously, ceaselessly transposition matrix Y placed in the middle one by one.Must notice that the counting sequence that writes of Fig. 6 (a) is mutual transposition.Fig. 8 (a) is mutual transposition with the counting sequence that reads of Fig. 8 (d).
This transpose memory 120 also can be applicable to a two-dimension anti-discrete cosine change-over circuit.As shown in table 1, first one dimension anti-discrete cosine change-over circuit 110 is to pull capable hurdle sequence via bus-bar 111 output data.Second one dimension anti-discrete cosine change-over circuit 140 must once be read two elements of matrix Y placed in the middle with the capable order in hurdle, and each hurdle must heavily cover and reads once.Suppose the one dimension anti-discrete cosine conversion sequence ground output matrix Y1 of Fig. 4, Y2, Y3 ... element, write address counter 151 writes transpose memory 120 according to the counting sequence of Fig. 6 (b) with the element of matrix Y1, as table 5.Just, incite somebody to action in the period 1 Writing memory element 171-0; In second round, with element
Figure C9410911500222
Writing memory element 172-3; In the 3rd accent phase, with element
Figure C9410911500223
Writing memory element 172-0; In the period 4, with element
Figure C9410911500224
Writing memory element 171-3, or the like.Figure 11 represents the state of the 56th all after date dual-port random access memory 141-144, and wherein dash area is represented empty memory element.
Table 6 has shown between cycle 57 and 120 that dual-port random access memory 141-144 is the contrast of read/write matrix element simultaneously.From cycles 57 beginning, read the element of address counter 152 according to the counting sequence sensor matrix Y1 of the 8th (b) figure.Therefore, the 57th cycle, when reading address counter 152 from memory element 171-0, it is right that 173-0 reads element
Figure C9410911500225
Simultaneously, write address counter 151 is with element Simultaneously, write address counter 151 is with element Writing memory element 173-12.
The 65th cycle, write address counter 151 begin with the element sequence of matrix Y2 write those and read the memory element that address counter 152 no longer needs.At this moment, write address counter 151 is to count according to the counting sequence of Fig. 6 (c).Therefore, the 65th cycle, when reading address counter 152 from memory element 172-0, it is right that 174-0 reads element The time, write address counter 151 is with element
Figure C9410911500231
Writing memory element 171-0.The matrix of this while according to Fig. 6 (c) order writes, and reads with matrix according to Fig. 8 (b) order, continues up to the 120th cycle, sees table 6 for details.After the 120th cycle, the state of dual-port random access memory 141-144 sees Figure 12 for details.Wherein dash area represents to read the memory element part that address counter 152 " no longer needs ".
Table 7 has shown between cycle 121 and 184 that dual-port random access memory 141-144 is the contrast of read/write matrix element simultaneously.Write address counter 151 counts up to the 128th cycle in order to write the element of matrix Y2 according to the counting sequence of Fig. 6 (c) always.And then according to Fig. 8 (c) counting from 184 cycles of the 121st cycle to the in order to write the element of matrix Y3.After the 184th cycle, the state of dual-port random access memory 141-144 sees Figure 13 for details.Wherein dash area is represented the memory element part that read address counter 152 " no longer needs ".State shown in Figure 13, very similar with Figure 11, therefore the process of above-mentioned while read/write matrix element can ceaselessly continue.
Just as the positive discrete cosine transform circuit 100 of two dimension, this transpose memory 120 can have no to be interrupted in two-dimension anti-discrete cosine change-over circuit 100, ceaselessly matrix placed in the middle of transposition.In the case, write address counter 151 writes matrix element with the counting sequence of Fig. 6 (b), (c) alternately, and reads matrix element with the counting sequence of Fig. 8 (b), (c) alternately.Must notice that Fig. 6 (b) and the counting sequence that writes of Fig. 6 (c) are mutual transposition.Fig. 8 (b) figure is mutual transposition with the counting sequence that reads of Fig. 8 (c).
As mentioned above, write address counter 151 is in order to make matrix element to reading at an easy rate with the order that matrix element is stored in dual-port random access memory 141-144.For example, when matrix element when exporting the positive discrete cosine transform of one dimension circuit 140 to, output is the capable order in towing hurdle in proper order, and each towing hurdle all heavily covers once, just (Y 0, Y 56) ..., (Y 24, Y 32) ..., (Y 0, Y 56) ..., (Y 24, Y 32), (Y 0, Y 56) ..., (Y 24, Y 32), (Y 31, Y 39) ...Therefore the right element of each element reads address counter 152 and can read easily from different quadrants.Same, when matrix element when exporting one dimension anti-discrete cosine change-over circuit 140 to, output is that the hurdle is capable in proper order in proper order, and every hurdle all heavily covers once, just (Y 0, Y 8) ..., (Y 48, Y 56), (Y 0, Y 8) ..., (Y 48, Y 56), (Y 55, Y 63) ...Therefore the right element of each element reads address counter 152 and can read easily respectively from single file and duplicate rows.
This shows that the particular storage element that write address counter 151 writes dual-port random access memory 141-144 with matrix element makes and to read address counter 152 the sensor matrix element is right at an easy rate.This need only suitably control S1 and S2 selects control signal to reach.For example, shown in Fig. 8 (a) and table 3, in order to read element to (Y 8, Y 48), reading address counter 152 need only OPADD and point to memory element 171-4,172-4,173-4,174-4, suitably select control signal S1 and S2 with the memory element 171-4 of output dual-port random access memory 141 and the memory element 173-4 that double-port random is deposited storer 143 simultaneously, that corresponding matrix element is to (Y 8, Y 48) can read.
This transpose memory 120 also can be exported different matrixes in proper order according to different output sequences.For example, the matrix of first input the hurdle is capable exports in proper order with towing, and the matrix of second input is with the capable phyllotaxy output in hurdle.But in this case, first matrix must all be read from transpose memory 120, and second matrix could write again.Therefore, caused the cycle delay of a matrix length.But, but improve the cycle delay of the system length of 120 palpuses of traditional transpose memory greatly.The cycle delay of system length promptly refers to a matrix reception and registration through two one dimension change-over circuits, transpose memory, and required time of front/rear processor.
Briefly, this transpose memory comprises four dual-ported memories, and write address counter and one read address counter.If a matrix is for exporting positive discrete cosine transform circuit to, that petty write address counter writes specific dual-ported memory with matrix element according to different quadrants.If a matrix is for exporting the anti-discrete cosine change-over circuit to, that petty write address counter writes specific dual-ported memory with matrix element according to list or Shuan Lan and row.Therefore, this transpose memory accounts for less integrated circuit area.
Though the present invention illustrates that with the every embodiment that is disclosed the claim scope of wherein asking is not limited to by these embodiment contents.Any other is according to the interest field of asking in the present invention institute, and the essence of its original idea, principle, essential ideas and intension, and implementer in addition all have been included within the scope that every claim defined of the present invention by following the application.
Table 1
Circuit 110 List entries Circuit 140 Output sequence
1-D DCT Y 0,Y 1,...,Y 8, Y 9,...,Y 63 1-D DCT (Y 0,Y 56),...,(Y 34,Y 32),(Y 0, Y 56),...,(Y 24,Y 32),(Y 3239)
1-D IDCT Y 0,Y 7,...,Y 8, Y 51,...,Y 60 1-D IDCT (Y 0,Y 8),...,(Y 48,Y 56),(Y 0, Y 8),...,(Y 48,Y 56),(Y 55,Y 63)
1-D DCT Y 0,Y 1,...,Y 3, Y 9,...,Y 63 1-D IDCT (Y 0,Y 8),...,(Y 46,Y 56),(Y 0, Y 9),...,(Y 48,Y 56),(Y 55,Y 63)
1-D IDCT Y 0,Y 7,..,Y 8, Y 51,...,Y 60 1-D DCT (Y 0,Y 56),...,(Y 24,Y 32),(Y 3, Y 55),...,(Y 24,Y 32),(Y 31,Y 39)
Table 2
Cycle Input
1 Y 1 0 171-0
2 Y 1 1 171-1
3 Y 1 2 171-2
4 Y 1 3 171-3
5 Y 1 4 172-3
6 Y 1 5 172-2
7 Y 1 6 172-1
8 Y 1 7 172-0
9 Y 1 8 171-4
10 Y 1 9 171-5
11 Y 1 10 171-6
12 Y 1 11 171-7
13 Y 1 12 172-7
14 Y 1 13 172-6
15 Y 1 14 172-5
16 Y 1 15 172-4
Cycle Input
17 Y 1 16 171-8
18 Y 1 17 171-9
19 Y 1 18 171-10
20 Y 1 19 171-11
21 Y 1 20 172-11
22 Y 1 21 172-10
23 Y 1 22 172-9
24 Y 1 23 172-8
25 Y 1 24 171-12
26 Y 1 25 171-13
27 Y 1 26 171-14
28 Y 1 27 171-15
29 Y 1 28 172-15
30 Y 1 29 172-14
31 Y 1 30 172-13
32 YU 31 172-12
Cycle Input
33 Y 1 32 173-12
34 Y 1 33 173-13
35 Y 1 34 173-14
36 Y 1 35 173-15
37 Y 1 36 174-15
38 Y 1 37 174-14
39 Y 1 38 174-13
40 Y 1 39 174-12
41 Y 1 40 173-8
42 Y 1 41 173-9
43 Y 1 42 173-10
44 Y 1 43 173-11
45 Y 1 44 174-11
46 Y 1 45 174-10
47 Y 1 46 174-9
48 Y 1 47 174-8
Cycle Input
49 Y 1 48 173-4
50 Y 1 49 171-5
51 Y 1 50 173-6
52 Y 1 51 173-7
53 Y 1 52 174-7
54 Y 1 53 174-6
55 Y 1 54 174-5
56 Y 1 55 174-4
57 Y 1 56 173-0
Table 3
Cycle Input Output
58 Y 1 57 173-1 (Y 1 0,Y 1 56) 171-0,173-0
59 Y 1 58 173-2 (Y 1 8,Y 1 48) 171-4,173-4
60 Y 1 59 173-3 (Y 1 16,Y 1 40) 171-8,173-8
61 Y 1 60 174-3 (Y 1 24,Y 1 32) 171-12,173-12
62 Y 1 61 174-2 (Y 1 0,Y 1 56) 171-0,173-0
63 Y 1 62 174-1 (Y 1 8,Y 1 48) 171-4,173-4
64 Y 1 63 174-0 (Y 1 16,Y 1 40) 171-8,173-8
65 Y 2 0 171-0 (Y 1 24,Y 1 32) 171-12,173-12
66 Y 2 1 171-4 (Y 1 1,Y 1 57) 171-1,173-1
67 Y 2 2 171-8 (Y 1 9,Y 1 49) 171-5,173-5
68 Y 2 3 171-12 (Y 1 17,Y 1 41) 171-9,173-9
69 Y 2 4 173-12 (Y 1 25,Y 1 33) 171-13,173-13
70 Y 2 5 173-8 (Y 1 11 57) 171-1,173-1
71 Y 2 6 173-4 (Y 1 91 49) 171-5,173-5
72 Y 2 7 173-0 (Y 1 17,Y 1 41) 171-9,173-9
73 Y 2 8 171-1 (Y 1 25,Y 1 33) 171-13,173-13
74 Y 2 9 171-5 (Y 1 2,Y 1 59) 171-2,173-2
Cycle Input Output
75 Y 2 10 171-9 (Y 1 10,Y 1 50) 171-6,173-6
76 Y 2 11 173-2 (Y 1 18,Y 1 42) 171-10,173-10
77 Y 2 12 173-3 (Y 1 26,Y 1 34) 171-14,173-14
78 Y 2 13 174-3 (Y 1 2,Y 1 58) 171-2,173-2
79 Y 2 14 173-5 (Y 1 10,Y 1 50) 171-6,173-6
80 Y 2 15 173-1 (Y 1 18,Y 1 42) 171-10,173-10
81 Y 2 16 171-2 (Y 1 36,Y 1 34) 171-14,173-14
114 Y 2 49 172-5 (Y 1 7,Y 1 63) 172-0,174-0
115 Y 2 50 172-9 (Y 1 15,Y 1 55) 172-4,174-4
116 Y 2 51 172-13 (Y 1 23,Y 1 47) 172-8,174-8
117 Y 2 52 174-13 (Y 1 31,Y 1 39) 172-12,174-12
118 Y 2 53 174-9 (Y 1 71 63) 172-0,174-0
119 Y 2 54 174-5 (Y 1 151 55) 172-4,174-4
120 Y 2 55 174-1 (Y 1 23,Y 1 47) 172-8,174-8
121 Y 2 56 172-0 (Y 1 31,Y 1 39) 172-12,174-12
Table 4
Cycle Input Output
122 Y 2 57 172-4 (Y 2 0,Y 56) 171-0,172-0
123 Y 2 58 172-8 (Y 2 8,Y 2 45) 171-1,172-1
124 Y 2 59 172-12 (Y 1 16,Y 1 40) 171-2,172-2
125 Y 2 60 174-12 (Y 2 24,Y 2 32) 171-3,172-3
126 Y 2 61 174-8 (Y 2 0,Y 2 56) 171-0,172-0
127 Y 2 62 174-4 (Y 2 8,Y 2 48) 171-1,172-1
128 Y 2 63 174-0 (Y 2 16,Y 2 40) 171-2,172-2
129 Y 3 0 171-0 (Y 2 24,Y 2 32) 171-3,172-3
130 Y 3 1 171-1 (Y 2 1,Y 2 57) 171-4,172-4
131 Y 3 2 171-2 (Y 2 9,Y 2 49) 171-5,172-5
132 Y 3 3 171-3 (Y 2 17,Y 2 41) 171-6,172-6
133 Y 3 4 172-3 (Y 2 25,Y 2 33) 171-7,172-7
134 Y 3 5 172-2 (Y 2 1,Y 2 57) 171-4,172-4
135 Y 3 6 172-1 (Y 2 9,Y 2 49) 171-5,172-5
136 Y 3 7 172-0 (Y 2 17,Y 1 41) 171-6,172-6
137 Y 3 8 171-4 (Y 2 25,Y 2 33) 171-7,172-7
138 Y 3 9 171-5 (Y 2 2,Y 2 58) 171-8,172-8
Cycle Input Output
139 Y 3 16 171-6 (Y 2 10,Y 2 50) 171-9,172-9
140 Y311 171-7 (Y 2 18,Y 2 42) 171-10,172-10
141 Y 3 12 172-7 (Y 2 26,Y 1 34) 171-11,172-11
142 Y 3 13 172-6 (Y 2 2,Y 2 58) 171-8,172-8
143 Y 3 14 172-5 (Y 2 10,Y 2 50) 171-9,172-9
144 Y 3 15 172-4 (Y 2 18,Y 1 42) 171-10,172-10
145 Y 3 16 171-8 (Y 2 26,Y 2 34) 171-11,172-11
178 Y 3 49 173-5 (Y 2 7,Y 2 63) 173-0,174-0
179 Y 3 50 173-6 (Y 2 15,Y 2 55) 173-1,174-1
180 Y 3 51 173-7 (Y 2 23,Y 2 47) 173-2,174-2
181 Y 3 52 174-7 (Y 2 31,Y 2 39) 173-3,174-3
182 Y 3 53 174-6 (Y 2 7,Y 2 63) 173-0,174-0
183 Y 3 54 174-5 (Y 2 15,Y 2 55) 173-1,174-1
184 Y 3 55 174-4 (Y 2 23,Y 2 47) 173-2,174-2
185 Y 3 56 173-0 (Y 2 31,Y 39) 173-3,174-3
Table 5
Cycle Input
1 Y 1 0 171-0
2 Y 1 7 172-3
3 Y 1 1 172-0
4 Y 1 6 171-3
5 Y 1 2 171-1
6 Y 1 5 172-2
7 Y 1 3 172-1
8 Y 1 4 171-2
9 Y 1 8 173-0
10 Y 1 15 174-3
11 Y 1 9 174-0
12 Y 1 14 173-3
13 Y 1 10 173-1
14 Y 1 13 174-2
15 Y 1 11 174-1
16 Y 1 12 173-2
Cycle Input
17 Y 1 16 171-4
18 Y 1 23 172-7
19 Y 1 17 172-4
20 Y 1 22 171-7
21 Y 1 18 171-5
22 Y 1 21 172-6
23 Y 1 19 172-5
24 Y 1 20 171-6
25 Y 1 24 173-4
26 Y 1 31 174-7
27 Y 1 35 174-4
28 Y 1 30 173-7
29 Y 1 26 173-5
30 Y 1 29 174-6
31 Y 1 27 174-5
32 Y 1 28 173-6
Cycle Input
33 Y 1 32 171-8
34 Y 1 39 172-11
35 Y 1 33 172-8
36 Y 1 38 171-11
37 Y 1 34 171-9
38 Y 1 37 172-10
39 Y 1 35 172-9
40 Y 1 36 171-8
41 Y 1 40 173-8
42 Y 1 47 174-11
43 Y 1 42 173-8
44 Y 1 46 173-11
45 Y 1 42 173-9
46 Y 1 45 174-10
47 Y 1 43 174-9
48 Y 1 44 173-10
Cycle Input
49 Y 1 48 171-12
50 Y 1 55 172-15
51 Y 1 49 172-12
52 Y 1 54 171-15
53 Y 1 50 171-13
54 Y 1 53 172-14
55 Y 1 51 172-13
56 Y 1 52 171-14
Table 6
Cycle Input Output
57 Y 1 56 173-12 (Y 1 0,Y 1 8) 171-0,173-0
58 Y 1 63 174-15 (Y 1 16,Y 1 24) 171-4,173-4
59 Y 1 57 174-12 (Y 1 32,Y 1 40) 171-8,173-8
60 Y 1 62 173-15 (Y 1 49,Y 1 56) 171-12,173-12
61 Y 1 58 173-13 (Y 1 0,Y 1 3) 171-0,173-0
62 Y 1 61 174-14 (Y 1 16,Y 1 24) 171-4,173-4
63 Y 1 59 174-13 (Y 1 32,Y 1 40) 171-8,173-8
64 Y 2 60 173-14 (Y 1 49,Y 1 55) 171-12,173-12
65 Y 2 0 171-0 (Y 1 1,Y 1 9) 172-0,174-0
66 Y 2 7 173-12 (Y 1 17,Y 1 25) 172-4,174-4
67 Y 2 1 173-0 (Y 1 33,Y 1 41) 172-8,174-8
68 Y 2 6 171-2 (Y 1 49,Y 1 57) 172-12,174-12
69 Y 2 2 171-4 (Y 1 1,Y 1 9) 172-0,174-0
70 Y 2 5 173-8 (Y 1 17,Y 1 25) 172-4,174-4
71 Y 2 3 173-4 (Y 2 33,Y 1 41) 172-8,174-8
72 Y 2 4 171-8 (Y 2 49,Y 1 57) 172-12,174-12
73 Y 2 8 172-0 (Y 1 2,Y 1 10) 171-1,173-1
Cycle Input Output
74 Y 2 15 174-12 (Y 1 18,Y 1 26) 171-5,173-5
75 Y29 174-0 (Y 1 34,Y 1 42) 171-9,173-9
76 Y 2 24 172-12 (Y 1 50,Y 1 38) 171-13,173-13
77 Y 2 10 172-4 (Y 1 2,Y 1 10) 171-1,173-1
78 Y 2 13 174-8 (Y 1 18,Y 1 26) 171-5,173-5
79 Y 2 11 174-4 (Y 1 34,Y 1 42) 171-9,173-9
80 Y 2 12 172-8 (Y 1 50,Y 1 58) 171-13,173-13
113 Y 2 40 172-2 (Y 1 7,Y 1 15) 172-3,174-3
114 Y 2 47 174-14 (Y 1 23,Y 1 31) 172-7,174-7
115 Y 2 41 174-2 (Y 1 391 47) 172-11,174-11
116 Y 2 46 172-14 (Y 1 55,Y 1 63) 172-15,174-15
117 Y 2 42 172-6 (Y 1 7,Y 1 15) 172-3,174-3
118 Y 2 45 174-10 (Y 1 23,Y 1 31) 172-7,174-7
119 Y 2 43 174-6 (Y 1 39,Y 1 47) 172-11,174-11
120 Y 2 44 172-10 (Y 1 55,Y 1 53) 172-15,174-15
Table 7
Cycle Input Output
121 Y 2 56 172-3 (Y 2 0,Y 2 8) 171-0,172-0
122 Y 2 63 174-15 (Y 2 16,Y 2 24) 171-1,172-1
123 Y 2 57 174-3 (Y 2 32,Y 2 40) 171-2,172-2
124 Y 2 62 172-15 (Y 2 48,Y 2 56) 171-3,172-3
125 Y 2 58 172-7 (Y 20,Y 2 8) 171-0,172-0
126 Y 2 61 174-11 (Y 2 16,Y 2 24) 171-1,172-1
127 Y 2 59 174-7 (Y 2 32,Y 2 40) 171-2,172-2
128 Y 2 60 172-11 (Y 2 49,Y 2 56) 171-3,172-3
129 Y 3 0 171-0 (Y 2 1,Y 2 9) 173-0,174-0
130 Y 3 7 172-3 (Y 2 17,Y 2 25) 173-1,174-1
131 Y 3 1 171-1 (Y 2 33,Y 2 41) 173-2,174-2
132 Y 3 6 171-3 (Y 2 49,Y 2 57) 173-3,174-3
133 Y 3 2 171-1 (Y 2 1,Y 2 9) 173-0,174-0
134 Y 3 5 172-2 (Y 2 17,Y 2 25) 173-1,174-1
135 Y 3 3 172-1 (Y 2 33,Y 2 41) 173-2,174-2
136 Y 3 4 171-2 (Y 2 49,Y 2 57) 173-3,174-3
137 Y 3 8 173-0 (Y 2 2,Y 2 10) 171-4,172-4
Cycle Input Output
138 Y 3 15 174-3 (Y 2 16,Y 2 26) 171-5,172-5
139 Y 3 9 174-0 (Y 2 34,Y 2 42) 171-6,172-6
140 Y 3 14 173-3 (Y 2 50,Y 2 58) 171-7,172-7
141 Y 3 10 173-1 (Y 2 2,Y 2 10) 171-4,172-4
142 Y 3 13 174-2 (Y 2 18,Y 2 26) 171-5,172-5
143 Y 3 11 174-1 (Y 2 34,Y 2 42) 171-6,172-6
144 Y 3 12 173-2 (Y 2 50,Y 2 58) 171-7,172-7
177 Y 3 40 173-8 (Y 2 7,Y 2 57) 173-12,174-12
178 Y 3 47 174-11 (Y 2 23,Y 2 31) 173-13,174-13
179 Y 3 41 174-8 (Y 2 39,Y 2 47) 173-13,174-14
180 Y 3 46 173-11 (Y 2 55,Y 2 63) 173-15,174-15
181 Y 3 42 173-9 (Y 2 7,Y 2 15) 173-12 174-12
182 Y 3 45 174-10 (Y 2 23,Y 2 31) 173-13,174-13
183 Y 3 43 174-9 (Y 2 39,Y 2 47) 173-14,174-14
184 Y 3 44 173-10 (Y 2 55,Y 2 63) 173-15,174-15

Claims (10)

1.一种转置存储器,其根据第一个序列顺序输入矩阵元素,及根据第二个序列顺序输出矩阵元素至第一或第二型式转换电路,其特征在于,该存储器装置包含:1. A transposition memory, which inputs matrix elements according to a first sequence order, and outputs matrix elements to a first or second type conversion circuit according to a second sequence order, characterized in that the memory device comprises: 至少四个双端口存储器,每一个双端口存储器用以存储四分之一个矩阵的元素;At least four dual-port memories, each dual-port memory is used to store a quarter of the elements of the matrix; 一个第一个计数器,作为所述的第一型式转换电路,并用来将收到的每一个矩阵元素写入对应于所述的元素的矩阵象限的特定之存储器的特定之存储元件;且作为所述的第二型式转换电路,并用来将收到的每一个矩阵元素写入对应于所述的元素的矩阵单或双行与栏的特定存储器的特定存储元件;及a first counter, as said first type conversion circuit, and for writing each matrix element received into a specific storage element of a specific memory corresponding to said matrix quadrant of said element; and as said The second type conversion circuit described above, and used to write each matrix element received into a specific storage element of a specific memory corresponding to the matrix single or double row and column of said element; and 一个第二个计数器,根据所述第二个序列顺序同时从所述存储器的特定存储元件读出所述元素对应的地址。A second counter reads the addresses corresponding to the elements from the specific storage elements of the memory at the same time according to the second sequence order. 2.如权利要求1所述的转置存储器,其特征在于,所述的第一型式转换电路为正离散余弦转换电路,而所述的第二型式转换电路为反离散余弦转换电路。2. The transpose memory according to claim 1, wherein the first type conversion circuit is a positive discrete cosine conversion circuit, and the second type conversion circuit is an inverse discrete cosine conversion circuit. 3.如权利要求1所述的转置存储器,其特征在于,所述的矩阵元素从所述的转置存储器以所述的第二个序列顺序的输出,是所述的转置存储器以所述的第一个序列顺序接收的所述的矩阵元素之转置。3. The transpose memory of claim 1, wherein the output of said matrix elements from said transpose memory in said second sequential order is said transpose memory in said The transpose of the elements of the matrix received in the first sequence order described above. 4.如权利要求1所述的转置存储器,其特征在于,所述的双端口存储器为双端口随机存取存储器。4. The transpose memory according to claim 1, wherein said dual-port memory is a dual-port random access memory. 5.如权利要求1所述的转置存储器,其特征在于,上述的转置存储器接收一序列的矩阵,且在一个特定型式的转换之下,所述的第一个计数器根据一个写入顺序,将每一收到的单数矩阵的所述矩阵元素写入单数的所述的存储器,并根据另一个为所述写入顺序的转置的写入顺序,将每一收到的双数矩阵的所述的矩阵元素写入偶数的所述的存储。5. The transpose memory of claim 1, wherein the above-mentioned transpose memory receives a sequence of matrices, and under a specific type of conversion, the first counter is based on a write order , write said matrix elements of each received singular matrix into singular said memory, and write each received double matrix according to another write order that is the transposition of said write order The elements of the matrix are written to even numbers of the storage. 6.如权利要求1所述的转置存储器,其特征在于,上述的第一及第二个计数器可以同时计数,且上述的第一及第二个计数器的上述的计数顺序除了相互同步,也与接收上述的转置存储器的矩阵元素同步,由此,上述的第二个计数器只在上述的第一个计数器把将要被上述的第二个计数器读取的上述的矩阵元素写入上述的特定的存储元件之后,才把存储在上述存储元件内上述的接收的矩阵元素以一特定的序列顺序读出。6. The transposition memory as claimed in claim 1, wherein said first and second counters can count at the same time, and the above-mentioned counting order of said first and second counters is not only synchronous with each other, but also Synchronous with receiving the matrix elements of the above-mentioned transpose memory, whereby the above-mentioned second counter writes the above-mentioned matrix elements to be read by the above-mentioned second counter into the above-mentioned specific After the storage element, the above-mentioned received matrix elements stored in the above-mentioned storage element are sequentially read out in a specific sequence. 7.如权利要求1所述的转置存储器,其特征在于,上述的第一个计数器平行地输出一个计数到上述四个存储器的写入地址输入端,另外,上述的第一个计数器输出适当的起动信号到上述的存储器,并根据上述第一个记数器的上述计数的指示,一次只将接收的矩阵元素存储到上述存储器的一个存储元件。7. The transposition memory as claimed in claim 1, characterized in that, said first counter outputs a count in parallel to the write address input terminals of said four memories, and in addition, said first counter outputs an appropriate and storing the received matrix elements in only one storage element of said memory at a time as indicated by said count of said first counter. 8.如权利要求1所述的转置存储器,其特征在于,还包含:8. The transpose memory as claimed in claim 1, further comprising: 一个第一个多工器,接收从第一个上述的存储器和第二个上述的存储器输出的矩阵元素并连接至第一和第二个资料输入,另外接收上述之第二个计数器的第一个选择控制信号并连接至该第一个多工器的选择控制输入;a first multiplexer receiving matrix elements output from the first aforementioned memory and the second aforementioned memory and connected to the first and second data inputs, and additionally receiving the first of the aforementioned second counters a selection control signal and connected to the selection control input of the first multiplexer; 一个第二个多工器,接收从第三个上述的存储器和上述的第一个多工器所选择输出的矩阵元素并连接至第一和第二个资料输入,另外接收上述的第二个计数器的第二个选择控制信号并连接至该第二个多工器的选择控制输入;a second multiplexer receiving matrix elements selected from the third aforementioned memory and the output of the first aforementioned multiplexer and connected to the first and second data inputs, additionally receiving the aforementioned second a second selection control signal of the counter and connected to the selection control input of the second multiplexer; 一个第三个多工器,接收从第一个上述的存储器和第四个上述的存储器输出的矩阵元素并连接至第一和第二个资料输入,另外接收上述的第二个计数器的上述的第二个选择控制信号并连接至该第三个多工器的选择控制输入;a third multiplexer receiving the matrix elements from the output of the first said memory and the fourth said memory and connected to the first and second data inputs, additionally receiving said said second counter a second selection control signal connected to the selection control input of the third multiplexer; 上述的第二个计数器平行地输出一个计数到上述的四个存储器的每一个,并输出特定的第一和第二个选择控制信号,根据上述的第二个计数器的上述的计数的指示,从其中两个上述的存储器的存储元件选择出矩阵元素。The above-mentioned second counter outputs a count in parallel to each of the above-mentioned four memories, and outputs specific first and second selection control signals, according to the indication of the above-mentioned count of the above-mentioned second counter, from The storage elements of two of the above-mentioned memories are selected as matrix elements. 9.如权利要求1所述的转置存储器,其特征在于,上述的第一个记数器将收到的上述的矩阵元素写到指定的双端口存储器的特定的存储元件,使得上述的第二个计数器只要能够输出单一个地址到上述的双端口存储器,就可以读出上述矩阵元素的元素对应地址。9. The transpose memory as claimed in claim 1, characterized in that, said first counter writes said matrix element received to a specific storage element of a designated dual-port memory, so that said first counter As long as the two counters can output a single address to the above-mentioned dual-port memory, the address corresponding to the element of the above-mentioned matrix element can be read out. 10.一个二维转换电路,其特征在于包含:10. A two-dimensional conversion circuit, characterized in that it comprises: 第一个和第二个一维转换电路,上述的第一个一维转换电路接收一个输入矩阵,并根据第一个序列顺序输出一个一维转换矩阵的元素;The first and second one-dimensional conversion circuits, the above-mentioned first one-dimensional conversion circuit receives an input matrix, and sequentially outputs elements of a one-dimensional conversion matrix according to the first sequence; 一个转置存储器,接收上述的第一个序列顺序的矩阵元素,并包含:A transpose memory that receives the matrix elements in the first sequence order above and contains: 四个双端口存储器,而每一个存储器用以存储四分之一个上述的第一个序列顺序的上述的矩阵的元素;four dual-port memories, and each memory is used to store a quarter of the elements of the above-mentioned matrix in the order of the first sequence; 一个第一个记数器,作为第一型式转换电路的上述的第一个一维转换电路,并用来将收到的每一个上述的第一个序列顺序的矩阵元素写入对应到上述的元素的矩阵象限的特定存储器的特定存储元件,并作为第二型式转换电路的上述的第二个一维转换电路,并用来将收到的每一个上述的第一个序列顺序的矩阵元素写入对应于上述元素的矩阵单或双行与栏的特定存储器的特定存储元件;及A first counter, which is used as the above-mentioned first one-dimensional conversion circuit of the first type conversion circuit, and is used to write each matrix element of the above-mentioned first sequence sequence received corresponding to the above-mentioned element The specific storage element of the specific memory of the matrix quadrant, and the above-mentioned second one-dimensional conversion circuit as the second type conversion circuit, and is used to write each received matrix element of the above-mentioned first sequence order into the corresponding specific storage elements of specific memory in single or double rows and columns of the matrix of said elements; and 一个第二个计数器,根据第二个序列顺序用来从上述存储器的特定存储元件读出上述一维转置矩阵之元素对应的地址;A second counter is used to read the address corresponding to the element of the above-mentioned one-dimensional transposition matrix from the specific storage element of the above-mentioned memory according to the second sequence order; 上述的第二个一维转换电路接收上述的第二个序列的元素,并输出一个二维转换矩阵。The above-mentioned second one-dimensional transformation circuit receives the elements of the above-mentioned second sequence, and outputs a two-dimensional transformation matrix.
CN94109115A 1994-08-19 1994-08-19 Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform Circuit Expired - Fee Related CN1076838C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN94109115A CN1076838C (en) 1994-08-19 1994-08-19 Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform Circuit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN94109115A CN1076838C (en) 1994-08-19 1994-08-19 Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform Circuit

Publications (2)

Publication Number Publication Date
CN1122026A CN1122026A (en) 1996-05-08
CN1076838C true CN1076838C (en) 2001-12-26

Family

ID=5033774

Family Applications (1)

Application Number Title Priority Date Filing Date
CN94109115A Expired - Fee Related CN1076838C (en) 1994-08-19 1994-08-19 Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform Circuit

Country Status (1)

Country Link
CN (1) CN1076838C (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3971135B2 (en) * 2001-07-11 2007-09-05 株式会社テクノマセマティカル DCT matrix decomposition method and DCT apparatus
US7082450B2 (en) * 2001-08-30 2006-07-25 Nokia Corporation Implementation of a transform and of a subsequent quantization
CN100366068C (en) * 2004-03-09 2008-01-30 北京中星微电子有限公司 A storage space saved storage processing method
CN109408117B (en) * 2018-10-08 2021-01-26 京东方科技集团股份有限公司 Matrix transposing device and method, and display device
US10831507B2 (en) 2018-11-21 2020-11-10 SambaNova Systems, Inc. Configuration load of a reconfigurable data processor
US11188497B2 (en) 2018-11-21 2021-11-30 SambaNova Systems, Inc. Configuration unload of a reconfigurable data processor
US10698853B1 (en) 2019-01-03 2020-06-30 SambaNova Systems, Inc. Virtualization of a reconfigurable data processor
US10768899B2 (en) * 2019-01-29 2020-09-08 SambaNova Systems, Inc. Matrix normal/transpose read and a reconfigurable data processor including same
US11386038B2 (en) 2019-05-09 2022-07-12 SambaNova Systems, Inc. Control flow barrier and reconfigurable data processor
US11055141B2 (en) 2019-07-08 2021-07-06 SambaNova Systems, Inc. Quiesce reconfigurable data processor
CN110826711B (en) * 2019-10-29 2022-04-26 中昊芯英(杭州)科技有限公司 Matrix processing device, method and equipment
US11809908B2 (en) 2020-07-07 2023-11-07 SambaNova Systems, Inc. Runtime virtualization of reconfigurable data flow resources
US11782729B2 (en) 2020-08-18 2023-10-10 SambaNova Systems, Inc. Runtime patching of configuration files
US20230297270A1 (en) * 2020-09-27 2023-09-21 Cambricon (Xi'an) Semiconductor Co., Ltd. Data processing device, integrated circuit chip, device, and implementation method therefor
US11409540B1 (en) 2021-07-16 2022-08-09 SambaNova Systems, Inc. Routing circuits for defect repair for a reconfigurable data processor
US11556494B1 (en) 2021-07-16 2023-01-17 SambaNova Systems, Inc. Defect repair for a reconfigurable data processor for homogeneous subarrays
US11327771B1 (en) 2021-07-16 2022-05-10 SambaNova Systems, Inc. Defect repair circuits for a reconfigurable data processor

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN87103462A (en) * 1986-05-12 1987-11-25 菲利浦光灯制造公司 Discrete cosine conversion device
CN87104093A (en) * 1986-06-06 1987-12-16 汤姆生无线电报总公司 The calculation element of one dimension cosine transform and the image coding device and the decoding device that comprise this calculation element
US5181183A (en) * 1990-01-17 1993-01-19 Nec Corporation Discrete cosine transform circuit suitable for integrated circuit implementation
US5291429A (en) * 1991-03-08 1994-03-01 Fujitsu Limited Circuit for matrix calculation of discrete cosine transformation

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN87103462A (en) * 1986-05-12 1987-11-25 菲利浦光灯制造公司 Discrete cosine conversion device
CN87104093A (en) * 1986-06-06 1987-12-16 汤姆生无线电报总公司 The calculation element of one dimension cosine transform and the image coding device and the decoding device that comprise this calculation element
US5181183A (en) * 1990-01-17 1993-01-19 Nec Corporation Discrete cosine transform circuit suitable for integrated circuit implementation
US5291429A (en) * 1991-03-08 1994-03-01 Fujitsu Limited Circuit for matrix calculation of discrete cosine transformation

Also Published As

Publication number Publication date
CN1122026A (en) 1996-05-08

Similar Documents

Publication Publication Date Title
CN1076838C (en) Transpose Memory for Discrete Cosine Transform/Inverse Discrete Cosine Transform Circuit
US5481487A (en) Transpose memory for DCT/IDCT circuit
CA1277036C (en) Orthogonal transform processor
CA1290854C (en) Two-dimensional discrete cosine transform processor
Lawrie Access and alignment of data in an array processor
Shen et al. A unified 4/8/16/32-point integer IDCT architecture for multiple video coding standards
EP1016970A2 (en) A memory architecture for parallel data access along any given dimension of an n-dimensional rectangular data array
CN103369326B (en) Be suitable to the transform coder of high-performance video coding standard HEVC
CN1009034B (en) Discrete cosine conversion device
Mehlhorn et al. Area—Time optimal VLSI integer multiplier with minimum computation time
US5508538A (en) Signal processing applications of massively parallel charge domain computing devices
Robert Block LU decompositon of a band matrix on a systolic array
Zhang et al. Hardware architecture design of block-matching and 3D-filtering denoising algorithm
Hsiao et al. A cost-efficient and fully-pipelinable architecture for DCT/IDCT
US8572148B1 (en) Data reorganizer for fourier transformation of parallel data streams
Mohanty et al. New scan method and pipeline architecture for VLSI implementation of separable 2-D FIR filters without transposition
Nayak Bit-level systolic imlementation of 1-D and 2-D discrete wavelet transform
Aggoun et al. A parallel 3D DCT architecture for the compression of integral 3D images
KR20000013653A (en) Forward/reverse optimum integer cosine transform apparatus and method
Avila et al. A one gigaflop VLSI systolic processor
CN114554225B (en) Image encoding method, apparatus, device and computer readable medium
Demassieux et al. Orthogonal transforms
Coltuc et al. Scan-directional architectures
Zhang et al. A low area pipelined 2-D DCT architecture for JPEG encoder
Prots' ko et al. The Analysis of Models of the Block-cyclic Structures of the DCT-II core for the Synthesis of Fast Algorithms.

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20090109

Address after: Taipei City, Taiwan, China

Patentee after: Institute for Information Industry

Address before: Taiwan, China

Patentee before: Industrial Technology Research Institute

ASS Succession or assignment of patent right

Owner name: YULIN TECHNOLOGY CO., LTD.

Free format text: FORMER OWNER: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE

Effective date: 20090109

C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20011226

Termination date: 20100819