Let's made a PC-base metal detector with usb interface !!!


  • Great work, Aziz. This kind of flexibility is exactly what is needed to research superior pulse designs and get real data on what really works best. It will be interesting to see what you find with it. Maybe current designs are already near optimal; that often happens, since engineers can have excellent instincts.

    Cheers,

    -SB

    Comment


    • Hello friends,

      I have now optimized the PI op-code processor further. I have freed many global registers, which are now available to the other threads or the application. PI op-code decoding now takes 6 CPU clocks. If you have an idea how to reduce this further, please let me know.

      Below are the latest ISR routines of the op-code processor.

      Code:
       
      ; Timer/Counter2 Output Compare Match interrupt request service routine
      ISR_TIM2_COMP:
           out   TCCR2, PI_ZERO      ; (1) disable timer 2 (PI_ZERO:=ZH=0)
       
      ; External INT0 interrupt request service routine
      ISR_EXT_INT0:
      _next_cmd:
           ld    ZL, X+              ; (2) load command
       
           ; jump to command vector
           ijmp                      ; (2) + (2) for another rjmp in the command vector
      Here is a basic implementation of the op-code execution (after decoding).

      Code:
      //CMD_RETI: return from interrupt (continue with next cycle)
      Cmd_Reti:
           reti                      ; (4)
       
      ;------------------------------------------------
      
      //CMD_NOP: no operation
      Cmd_Nop:
           rjmp  _next_cmd           ; (2)
       
      ;------------------------------------------------
      //CMD_ST: store data to direct address, parameter: 8-bit Addr, 8-bit data
      Cmd_St:
           ld    PI_Addr,  X+        ; (2) get short address
           ld    PI_Param, X+        ; (2) get data
           st    Z, PI_Param         ; (2) IO = data
           rjmp  _next_cmd           ; (2)
      The NOP op-code execution takes 2 CPU clocks. Leaving out the interrupt latency (that is, when NOP sits between other commands), decoding plus execution takes only 6+2 = 8 CPU clocks (0.5 µs). It can be used as a fixed active delay between I/O operations.
      The store op-code takes a total of 6+8 = 14 CPU clocks.
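The clock counts quoted above can be re-derived with a small model. This is only a sketch in Python (not firmware), and it assumes a 16 MHz CPU clock, which is what "8 CPU clocks (0.5 µs)" implies. The names `DECODE_CLOCKS`, `EXEC_CLOCKS`, `opcode_clocks` and `opcode_time_us` are made up for illustration.

```python
# Clock-count model of the op-code processor as posted above.
# Assumption: 16 MHz CPU clock (8 clocks = 0.5 us, as stated in the post).
F_CPU_HZ = 16_000_000

# Decoding: ld ZL,X+ (2) + ijmp (2) + rjmp in the command vector (2)
DECODE_CLOCKS = 2 + 2 + 2

# Execution clocks per op-code, taken from the instruction comments.
EXEC_CLOCKS = {
    "NOP": 2,              # rjmp _next_cmd
    "ST": 2 + 2 + 2 + 2,   # ld, ld, st, rjmp _next_cmd
}


def opcode_clocks(name):
    """Decoding plus execution clocks, without interrupt latency."""
    return DECODE_CLOCKS + EXEC_CLOCKS[name]


def opcode_time_us(name):
    """Same figure converted to microseconds at F_CPU_HZ."""
    return opcode_clocks(name) * 1e6 / F_CPU_HZ


for name in EXEC_CLOCKS:
    print(name, opcode_clocks(name), "clocks,", opcode_time_us(name), "us")
```

Under these assumptions the model reproduces the 8 clocks for NOP and 14 clocks for the store op-code.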

      Well, I never set out to "invent" a new MCU. But for the sake of flexibility, there is no other way to achieve this.

      Aziz

      Comment


      • Hi again,

        You won't believe it, but the code above can be optimized further. Do you have any idea how? (Tip: the ISR routine is very short, and unused interrupt vectors leave some room for code. The ISR routine can be implemented multiple times, once for each ISR vector.)

        Also, the NOP op-code can be reduced to a total of 4 CPU clocks (decoding and execution).

        Aziz

        Comment


        • Hi again,

          The interrupt latency is now reduced by 2 CPU clocks. There is no longer any rjmp to the ISR routine: the ISR routine is placed directly in the interrupt vector table! See the tricky solution below.

          Code:
           
          ;------------------------------------------------------------------------
          ;------------------------- Interrupt Vectors ----------------------------
          ;------------------------------------------------------------------------
          
          ;------------------------------- RESET ----------------------------------
          ; External Pin, Power-on Reset, Brown-out Reset, and Watchdog Reset
          .ORG 0              
               rjmp ISR_RESET
          ;-------------------- External Interrupt Request INT0 -------------------
          ; Triggers PI Cycle Begin
          .ORG INT0addr
          _next_cmd:
               ld    ZL, X+              ; (2) Load command
               ; Jump to command vector
               ijmp                      ; (2) + (2) for another rjmp in the command vector
          
          ;-------------------- Timer/Counter2 Output Compare Match ---------------
          ; Timer/Counter2 Output Compare Match interrupt request service routine
          .ORG OC2addr
               out   TCCR2, PI_ZERO      ; (1) disable timer 2 (PI_ZERO:=ZH=0)
               
          ; Note: the following commands will overwrite the OVF2addr interrupt vector
          ; OVF2addr vector is not used, same ISR routine as for INT0 follows 
               
               ld    ZL, X+              ; (2) Load command
               ; Jump to command vector
               ijmp                      ; (2) + (2) for another rjmp in the command vector
          ;------------------------------------------------------------------------
          I wonder how this trick could be implemented on a PIC micro. Atmel's MCU architecture is quite efficient, isn't it?


          Aziz

          Comment


          • Hi all,

            The more I look into my code, the more optimizations I find. PI op-code decoding now takes only 4 CPU clocks for all op-codes (instead of 6). Some op-codes can be implemented directly in the op-code command vector table. The NOP op-code will then take only 4 CPU clocks (decoding, with 0 CPU clocks of execution time). That comes close to the 1 CPU clock of a NOP implemented in hardware.

            The solution for this very tricky method is as follows:
            In every op-code execution routine that ends with rjmp _next_cmd, that jump is replaced by the two ISR instructions themselves (a command prefetch, which omits the rjmp _next_cmd and saves 2 CPU clocks):

            Code:
                 ;.. op-code routine..
                 ;
                 ; command prefetch (same as ISR routine at INT0)
                 ld    ZL, X+              ; (2) Load command
                 ijmp                      ; (2) Jump to command
            Now all PI op-codes take 2 CPU clocks less to execute.
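To make the saving concrete, here is a rough Python comparison of the two schemes. It is a sketch, not the firmware: the function names are hypothetical, the "body" clocks leave out the trailing rjmp or prefetch, and the 12-clock figure it yields for the store op-code is derived here from the instruction comments rather than quoted from the posts.

```python
# Clocks per op-code before and after the prefetch change.
LD_IJMP = 4        # ld ZL,X+ (2) + ijmp (2)
VECTOR_RJMP = 2    # rjmp inside the command vector table
TAIL_RJMP = 2      # rjmp _next_cmd at the end of a routine

# Body clocks only (assumption: tail jump / prefetch counted separately).
BODY = {"NOP": 0, "ST": 6}   # ST: ld (2) + ld (2) + st (2)


def clocks_old(op):
    """Original scheme: decode via the vector rjmp, then rjmp _next_cmd."""
    return LD_IJMP + VECTOR_RJMP + BODY[op] + TAIL_RJMP


def clocks_prefetch(op, direct=False):
    """Prefetch scheme: ld + ijmp are inlined at the end of the previous
    routine, so the tail rjmp disappears. A direct vector entry also
    drops the rjmp in the command vector table."""
    return LD_IJMP + (0 if direct else VECTOR_RJMP) + BODY[op]


print("NOP:", clocks_old("NOP"), "->", clocks_prefetch("NOP", direct=True))
print("ST: ", clocks_old("ST"), "->", clocks_prefetch("ST"))
```

With these assumptions NOP drops from 8 to 4 clocks when coded directly in the vector table, and an indirectly coded op-code such as the store saves the 2 clocks of the removed rjmp.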


            See the direct command vector implementation in the following section:

            Code:
             
            //CMD_NOP: no operation
            .ORG Cmd_Vector+CMD_NOP
                 ld    ZL, X+              ; (2) Load command
                 ijmp                      ; (2) Jump to command
             
            //CMD_RETI: return from interrupt
            .ORG Cmd_Vector+CMD_RETI
                 reti
            A direct op-code implementation that spans more than one instruction (see the NOP op-code) leaves gaps in the op-code numbering. Let's assume NOP is coded with the op-code value 2. Then the op-code value 3 can no longer be used for another op-code (the ijmp sits there).

            Still, we can code up to 214 op-codes, and this should be plenty, I think. Remember, the ATmega16 itself has only 131 instructions. Some critical op-codes will therefore be coded directly in the vector table to reduce the execution time.
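One possible reading of the figure 214, offered as an assumption rather than something stated in the post: ZH is held at zero, so ijmp can only reach program word addresses 0..255, and the ATmega16 interrupt vector table (21 vectors of 2 words each) takes up 42 of them. The Python sketch below redoes that arithmetic and shows how a two-word direct implementation such as NOP uses up a second op-code value. The helper `free_slots` is hypothetical.

```python
# Assumption: ZH == 0, so ijmp reaches word addresses 0..255 only.
REACHABLE_WORDS = 256
# Assumption: ATmega16 vector table, 21 vectors of 2 words each.
VECTOR_TABLE_WORDS = 21 * 2

MAX_OPCODES = REACHABLE_WORDS - VECTOR_TABLE_WORDS


def free_slots(direct_sizes):
    """Op-code values still unused after placing direct implementations.

    direct_sizes maps an op-code value to the number of program words
    its direct implementation occupies in the command vector table.
    """
    used = set()
    for value, words in direct_sizes.items():
        used.update(range(value, value + words))
    return MAX_OPCODES - len(used)


print(MAX_OPCODES)
# NOP coded as op-code value 2 occupies values 2 and 3 (ld + ijmp).
print(free_slots({2: 2}))
```

If that reading is right, every word a direct implementation adds beyond the first costs one op-code value, which is why only the time-critical op-codes are worth coding this way.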

            The PI op-code processor is now becoming quite efficient. Do you see any further optimizations?

            Aziz

            Comment


            • Hi friends,

              I am now focusing on implementing the PI op-code processor in detail (coding the op-code instructions). We can't afford an inefficient PI op-code processor simulation, so every CPU clock we can save is real progress.

              I have made the implementation more portable, so that most global registers can be reassigned quickly. Some cannot be (ZH:ZL).

              Here are the global register names:

              Code:
              // r0..r14: application
              #define PI_tmp     r15       // temporary register for execution
              // r16..r23: application
              #define PI_Param1  r24       // Parameter 1
              #define PI_Param2  r25       // Parameter 2
              #define PI_Param   PI_Param1 // Single parameter register
              #define PI_PCL     XL        // PI op-code lo-program counter
              #define PI_PCH     XH        // PI op-code hi-program counter
              #define PI_PC      X         // PI op-code program counter
              #define PI_AddrL   ZL        // Lo-Address (must be ZL)
              #define PI_AddrH   ZH        // Hi-Address (must be ZH)
              #define PI_Addr    Z         // Address (must be Z)
              #define PI_ZERO    PI_AddrH  // Zero value register

              The soul of the op-code processor is the following macro with two instructions (often used in the code):

              Code:
               
              // op-code instruction prefetch macro
              .MACRO CMD_PREFETCH            ; (4) Prefetch next op-code and decode macro
                   ld    PI_AddrL,  PI_PC+   ; (2) Load command
                   ijmp                      ; (2) Jump to command
              .ENDMACRO
              When an op-code is implemented directly in the command vector table, decoding takes only 4 CPU clocks. It takes 2 CPU clocks more when the op-code is implemented indirectly (it then needs rjmp opcode_cmd_execution_ptr in the command vector table).

              Here is an example of an active wait op-code instruction:

              Code:
              //... interrupt vector table
              ;-------------------- External Interrupt Request INT0 -------------------
              ; Triggers Command Execution for Cycle Begin
              .ORG INT0addr
              _next_cmd:
                   CMD_PREFETCH              ; (4) Prefetch next op-code and decode macro
              ;-------------------- Timer/Counter2 Output Compare Match ---------------
              ; Timer/Counter2 Output Compare Match interrupt request service routine
              .ORG OC2addr
                   out   TCCR2, PI_ZERO      ; (1) disable timer 2 (PI_ZERO:=ZH=0)
               
              ; Note: the following commands will overwrite the OVF2addr interrupt vector
              ; OVF2addr vector is not used, same ISR routine as for INT0 follows 
                   CMD_PREFETCH              ; (4) Prefetch next op-code and decode macro
               
              //... other interrupt vectors
               
              ; PI op-code command vector table begins
              Cmd_Vector:
              //... somewhere below address 256
               
              //CMD_SWAIT: active short wait, parameter: 8-bit delay counter
              .ORG Cmd_Vector+CMD_SWAIT
                   rjmp Cmd_SWait         ; (2) jump to op-code execution routine
               
              //... other op-code vector jumps to the execution routine or direct vector implementations
               
              //... somewhere above address 256
               
              //... execution routines and other firmware code follows..
               
              //CMD_SWAIT: active short wait, parameter: 8-bit delay counter
              ; pure execution clocks: 1+n*3
              ; +2 for indirect vector coding at decoding
              ; +4 for op-code decoding
              ; total execution clocks: 7+n*3 (16 MHz: max. about 48 µs, reached with parameter 0, which counts as n=256)
              Cmd_SWait:
                   ld    PI_Param,  PI_PC+   ; (2) get delay counter
              Cmd_SWaitLoop:
                   dec   PI_Param            ; (1) counter--
                   brne  Cmd_SWaitLoop       ; (2) for branch, (1) for loop end
               
                   ; execution ends here (cpu clocks not taken into the calculation)
                   CMD_PREFETCH              ; (4) Prefetch next op-code and decode macro
              I will do my best to optimize the implementation further. Now you know how to build a microprocessor op-code instruction in software.
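The timing formula in the CMD_SWAIT comment can be checked with a few lines of Python. This is a sketch under the assumptions stated in that comment (16 MHz clock, indirect vector coding); the function names `swait_clocks` and `swait_us` are made up for this example.

```python
F_CPU_HZ = 16_000_000  # clock stated in the routine's comment


def swait_clocks(param, direct=False):
    """Total clocks for CMD_SWAIT with the given 8-bit parameter byte."""
    if not 0 <= param <= 255:
        raise ValueError("parameter must fit in 8 bits")
    # dec wraps 0 -> 255 before brne tests it, so 0 means 256 loops.
    n = param if param != 0 else 256
    # ld (2) + n * (dec 1 + brne 2), minus 1 because the last brne falls through
    execute = 2 + 3 * n - 1
    # prefetch (4), plus the rjmp in the command vector if coded indirectly
    decode = 4 + (0 if direct else 2)
    return decode + execute


def swait_us(param):
    """Delay in microseconds for the indirectly coded op-code."""
    return swait_clocks(param) * 1e6 / F_CPU_HZ


for p in (1, 10, 255, 0):
    print(p, swait_clocks(p), "clocks,", swait_us(p), "us")
```

The shortest wait (parameter 1) comes out at 10 clocks and the longest (parameter 0, i.e. 256 loops) at 775 clocks, which matches the 7+n*3 formula and the roughly 48 µs maximum.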

              Aziz

              Comment


              • Hi all,

                Once the firmware development is finished, and if the results promise real benefits, the laptop software will very likely get a plug-in interface API. I will then implement a basic Windows DLL interface because of its simplicity and flexibility. Any programming language could then be used to develop plug-in modules.

                A complete PI implementation will then consist of the universal firmware (the PI firmware implementation) and a laptop software plug-in (DLL) for processing and handling. The DLL could carry the universal firmware inside itself to hide the technology. I can also hide and encapsulate my own software.

                Then anything possible could be built with this universal PI architecture. New features and search modes could be integrated easily.

                Just my thoughts on what could be possible.

                Aziz

                Comment


                • Hi all,

                  I have tested the coil driver circuit and the INT0 MCU triggering (cycle trigger). It is working. The MOSFET gets quite hot because I am using long transmit pulses and a low coil resistance in combination with high transmit pulse rates.

                  I cannot see the high flyback voltage (no scope), but I can feel it with my wet fingers. The pre-amp seems to work. I just connected its output to the sound-card input. There is lots of noise (the coil is not shielded yet, and long unshielded signal wires cross the whole monster circuit).

                  The noise produced by the coil switching is very well synchronized. It produces lots of spectral harmonics (multiples of the switching frequency), so it can be eliminated completely. I couldn't observe any timing jitter problems yet.

                  I will map the MCU signals step by step to the PGA, integrator and S&H stages soon (module not tested yet). A basic PI implementation can already be coded in the MCU at the moment (it needs flashing).

                  Aziz

                  Comment


                  • What is PGA?

                    Comment


                    • Originally posted by miki73
                      What is PGA?
                      Programmable Gain Amplifier. Digital control lines modify the amplifier's feedback resistance to select different amplification factors.

                      Comment


                      • Hi all,

                        I have measured the power consumption of the whole circuit with the transmit coil firing. Transmit pulse width: 96 µs; the DC/DC charge pump is running at 12 kHz.

                        560 mA @12V for 3000 PPS (pulses per second)
                        340 mA @12V for 1500 PPS
                        250 mA @12V for 750 PPS
                        205 mA @12V for 375 PPS

                        150 mA @12V with no transmitting but the circuit otherwise fully active.
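For a rough feel of what these readings mean per pulse, the Python sketch below converts them to supply power and to extra supply energy per transmit pulse. It assumes the readings are average currents and that the 150 mA idle draw stays constant while transmitting; both are assumptions, and the function names are made up.

```python
SUPPLY_V = 12.0
IDLE_MA = 150.0  # circuit active, no transmitting (assumed constant)

# (pulses per second, supply current in mA), as measured above
MEASURED = [(3000, 560.0), (1500, 340.0), (750, 250.0), (375, 205.0)]


def power_w(current_ma):
    """Supply power in watts at SUPPLY_V."""
    return SUPPLY_V * current_ma / 1000.0


def energy_per_pulse_mj(pps, current_ma):
    """Extra supply energy per pulse above the idle draw, in millijoules."""
    extra_w = power_w(current_ma - IDLE_MA)
    return extra_w / pps * 1000.0


for pps, ma in MEASURED:
    print(pps, "PPS:", round(power_w(ma), 2), "W,",
          round(energy_per_pulse_mj(pps, ma), 2), "mJ per pulse")
```

On these numbers the extra draw works out to roughly 1.5 to 1.8 mJ per pulse at every pulse rate, so the supply current scales close to linearly with PPS on top of the idle consumption.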

                        Aziz

                        Comment


                        • HAPPY BIRTHDAY AZİZ..

                          I wish you many more happy years.... May the evil eye not touch you, mashallah.

                          Comment


                          • Tarsos,

                            cok tessekkür ederim arkadas. Cok calistigim icin, unutmustum ben.

                            For my English speaking friends:
                            Thank you very much. I forgot my birthday because I have been working so much on this project.

                            Take care,
                            Aziz

                            Comment


                            • Hi all,

                              I have done some I/O pin port mapping to the hardware:

                              PortA:
                              Purely for A/D conversions and some input pins reserved (not assigned yet)

                              PortB:
                              PB0: Integrator clear signal (output)
                              PB1: Integrator sample/hold signal (output)
                              PB2..PB7: not assigned yet

                              PortC:
                              PC0..2: PGA control lines (output, PC2: reserved for finer amplification factors).
                              PC3: PGA gate (PGA on/off)
                              PC4..7: S&H gates (to the four channels)

                              PortD:
                              PD0: USART (RXD)
                              PD1: USART (TXD)
                              PD2: INT0 input (cycle trigger line)
                              PD3: free, not assigned yet
                              PD4: Red LED (for debugging purposes)
                              PD5: Orange LED (for debugging purposes)
                              PD6: Green LED (for debugging purposes)
                              PD7: Transmit pulse gate signal

                              There are more control lines available to make some design extensions possible.
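A mapping like this can be written down as bit masks, for example to prepare the data bytes for the store op-code. The Python sketch below is only an illustration: the signal names, the active-high polarity, and the idea of putting a 3-bit PGA code on PC0..2 are assumptions, not taken from the hardware.

```python
# Bit numbers copied from the port mapping above. The signal names and the
# active-high encoding are assumptions made for this sketch.
PORTB_BITS = {"INT_CLEAR": 0, "INT_SAMPLE_HOLD": 1}
PORTC_BITS = {"PGA_CTRL0": 0, "PGA_CTRL1": 1, "PGA_CTRL2": 2, "PGA_GATE": 3,
              "SH_GATE0": 4, "SH_GATE1": 5, "SH_GATE2": 6, "SH_GATE3": 7}
PORTD_BITS = {"RXD": 0, "TXD": 1, "INT0": 2, "LED_RED": 4,
              "LED_ORANGE": 5, "LED_GREEN": 6, "TX_GATE": 7}


def mask(bits, *names):
    """OR together the bit masks of the named signals of one port."""
    value = 0
    for name in names:
        value |= 1 << bits[name]
    return value


def portc_value(pga_code, pga_on, sh_channel=None):
    """Port C byte: PGA code on PC0..2, PGA gate on PC3, one S&H gate."""
    if not 0 <= pga_code <= 7:
        raise ValueError("PGA code must fit in 3 bits")
    if sh_channel is not None and not 0 <= sh_channel <= 3:
        raise ValueError("S&H channel must be 0..3")
    value = pga_code
    if pga_on:
        value |= mask(PORTC_BITS, "PGA_GATE")
    if sh_channel is not None:
        value |= mask(PORTC_BITS, f"SH_GATE{sh_channel}")
    return value


print(hex(mask(PORTD_BITS, "TX_GATE")))
print(hex(portc_value(3, True, sh_channel=0)))
```

Keeping the bit numbers in one table like this makes it easy to adjust the masks once the unassigned pins (PB2..PB7, PD3) get a function.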

                              Aziz

                              Comment


                              • Hello friends,

                                Once the proof of concept is done (with a positive result), I will extend the hardware design slightly. I need the options of sampling the PGA output directly (integration disabled), inverting the PGA output, and extending the integrator to two channels (to allow integration and de-integration of signals).

                                The JFET on the integrator capacitor (J2, I think) will be made controllable so that the integration feature can be deactivated (capacitor disconnected). When the discharge signal is set, the integrator then acts as an inverting amplifier. The discharge resistor of the integrator will be set equal to the input resistor to give an amplification of -1 (inverting the PGA output). The PGA output can then be sampled directly by the S&H stages. This feature needs only two more I/O control lines and will make much more design flexibility possible.


                                Aziz

                                Comment
