Differences between revisions 13 and 14
Deletions are marked like this. Additions are marked like this.
Line 206: Line 206:
rake db:import model=Folder file=/tmp/folder.csv insert_only=false rollback=true RAILS_ENV=production
}}}
rake db:load_tree model=Folder file=/tmp/folder.csv insert_only=false rollback=true RAILS_ENV=production
}}}

= Export Tools =

== Rake Task db:export ==

 This is the pair of ''rake db:import''. The same DSL can be used to export data from the database into CSV file.
 Run for help:
  {{{
rake -D db:export
}}}

 An example:
  {{{
rake db:export csv_file=/tmp/a.csv config_file=script/v2_import/resource_sub_type_user_dsl.rb where='resource_type="User"'
}}}

== Rake Task db:load_tree ==

 This tool can be used to export tree type data, which can be imported by ''rake db:load_tree''.
 There is wrapper rake task to use the export module in ''lib/tasks/v2_export.rake''. Run for help:
  {{{
rake -D db:save_tree
}}}

 An example:
  {{{
rake db:save_tree model=Folder file=/tmp/WBS.txt
}}}

Migration External Data into V2

V2Master

Introduction

Import Tools

V2Import Rails Modul

Module Classes

  • A new module V2Import implemented in lib/v2_import.rb. The module has the following main classes:

    • Helper - methods used in other classes of the module
    • Reader::DSLReader - class to read and check the import DSL configuration file
    • Reader::CSVReader - read the input CSV file into a hash
    • Import - loads data into the database based on the result of the two previous class

Module Usage

  • The module can be used from the Rails console as described in this chapter or the Rake wrapper can be used as described in chapter "V2Import Rake Task" below.

    • Staring the Rails console:
      $ rails c production  # The development system will be used if "production" omitted
      > ActiveRecord::Base.logger.level = 1 # Switch off SQL logging to allow to see error messages easily
      > Authorization.current_user = User.find(2) # Set the appropriate user to avoid authorization errors
    The module should be initialized first passing the filenames (CSV file, import configuration DSL file) and CSV options:
    • import = V2Import::Import.new csv_file, config_file, csv_options

    csv_options are option of the standard Ruby CSV.new method. They can be used to change the default separators (col_sep: ',', row_sep: 'auto', quote_char: '"'). The module initialization exits with exception on the first configuration error found. The initialzed module can be queried to show the internal data formats:

    • y import.config_hash # show the internal format of the import configuration, read from the DSL file
      y import.csv_hash    # show the internal format of the input CSV file

    The load method of the initialized module should be called to load the data into the database:

    • import.load(load_options)

    load_options:

    • insert_only: default = true, all records are new.
    • rollback: default = false, rollback at the end, good for testing the input.

    The load method doesn't exits on errors, but rollbacks every changes if data error occurred. The 2 calls can be chained of course:

    • V2Import::Import.new(csv_file, config_file, csv_options).load(load_options)

DSL File Format

  • The file format explained on two samples. Pre-defined DSL files can be found in script/v2_import/. The first format is a document/revision/notes/classification import:

    • # encoding: utf-8                                # needed only to allow syntax check with ruby -c
      v2_import do                                     # outline block always v2_import
        date_format "%d.%m.%y"                         # optional input date format, defaults accepted: [YY]YY[-./]MM[-./]DD[-./]
        root_path "/var/tmp"                           # optional root path for Paperclip::Attachments
        model Document do                              # the main model block, can contain defaults, attributes,
                                                       # classifications blocks
          keys :code                                   # list of table fields of a possible composed unique key to
                                                       # find database records to be updated (use symbols)
          defaults do                                  # default values for the containing model
                                                       # the right side can contain evaluable rails code
            resource_sub_type_id ResourceSubType.get_default_by_resource(Document).id 
            folder :foreign_key do                     # association can be only foreign_key to load default value
              attributes do                            # attributes block for the folder record
                code "WBS"                             # lookup attributes, the right side can contain evaluable rails
                                                       # code
              end
            end
          end
      
          attributes do                               # attributes block for the main record
            code "NÂș CARTA"                           # the right side must contain a reference to a CSV column
            revisions :required do                    # association is also an attribute, it can contain defaults,
                                                      # attributes, classifications blocks again
                                                      # required option means - that the record is created anyway
              keys :code                              # unique key within the revision records associated to the document
              defaults do
                code "1"
                number 1
                user :foreign_key,  do
                  attributes do
                    initials "AH /SXS"
                  end
                end
                resource_sub_type :foreign_key, :required do # this is a required association as foreign key
                  attributes do
                    code "Revision Type"
                  end
                end
              attributes do
                date "FECHA DE EMISION"               # the right side must contain a reference to a CSV column
                title1 "DESCRIPCION"
                notes do                              # not required -> created only if at least 1 non-default
                                                      # attribute filled (in this case the note attribute)
                  keys :issue_datetime
                  defaults do
                    issuer_id 1
                  end
                  attributes do
                    note "OBSERVACIONES"              # the right side must contain a reference to a CSV column
                    issue_datetime "FECHA DE EMISION" # the same CSV column can be used more than once
                  end
                end
              end
              classifications do                      # classification of the revision, see description below
                # "zero level" classification, the value in CSV column will go into one of the value fields of object_classifications depending on data_tape.
                REASON  "Reason for Issue", lookup: nil 
              end
            end
          end
      
          # all classifications defined as
          # - code of the classification class (list)
          # - CSV column reference for the lookup value
          # - lookup option for the lookup field to find the classification
          #   - "lookup: nil" means a "zero level" classification
          # Note, that a classification class can appear more than 1 time to add more
          # classification in the same class.
          classifications do                    # classification of the document      
            ORG "DESTINATARIO", lookup: :name # Destination
            LOC "ESTADO",       lookup: :name # Status
          end
        end
        before_save do                               # optional callback before saving the instance
          self.... = ....
        end
        after_save do                                # optional callback after saving the instance
          self.... = ....
          save!
        end
      end
    The second example to load references (object_links) in a second step if both sides of the references are already loaded:
    • v2_import do
        model ObjectLink do
          attributes do
            left :foreign_key, :required, model: Document do  # define model, because polymorphic assoc.
              attributes do
                code "Left Code"
              end
            end
            right :foreign_key, :required, model: Document do # define model, because polymorphic assoc.
              attributes do
                code "Right Code"
              end
            end
          end
        end
      end

CSV File Format

  • The input file is a standard CSV file, with blank lines and comment lines (lines with starting '#') ignored.

Limitations

The current version (2014-08-14) does not support the following options:

  • multiple classifications in the same class
  • the to field in the case of 'zero-level' classification

  • classification data (from and to fields) on non 'zero-level' classification

Rake Task db:import

  • There is wrapper rake task to use the import module in lib/tasks/v2_import.rake. Run for help:

    • rake -D db:import
    An example:
    • rake db:import csv_file='/tmp/REGISTRO DE CARTAS 2014.csv' config_file='config/sites/astaldi/site/import/import_cfg_registro_de_cartas_dsl.rb' rollback=true RAILS_ENV=production

Rake Task db:load_tree

  • There is a rake task to load tree type data into the database, as Classification, Folder and Organisation. This tool deals however only with the fields code, name, description and parent_id. So this tool is not a complete solution for classifications.

    This tool understands two file formats: indented or non-indented. The parent-child relation is defined by TAB indentation in the indented format while the code of the parent is the first column in the case of the non-indented format.

    Otherwise the input file is a standard header-less CSV file, with blank lines and lines starting with character "#" ignored. But use "#" with caution, "#" will be determined after CSV read, so the following error can occur if ' " ' appears somewhere in the line: CSV::MalformedCSVError: Illegal quoting in line nn.

  • There is wrapper rake task to use the import module in lib/tasks/v2_import.rake. Run for help:

    • rake -D db:load_tree
    An example:
    • rake db:load_tree model=Folder file=/tmp/folder.csv insert_only=false rollback=true RAILS_ENV=production

Export Tools

Rake Task db:export

  • This is the pair of rake db:import. The same DSL can be used to export data from the database into CSV file. Run for help:

    • rake -D db:export
    An example:
    • rake db:export csv_file=/tmp/a.csv config_file=script/v2_import/resource_sub_type_user_dsl.rb where='resource_type="User"'

Rake Task db:load_tree

  • This tool can be used to export tree type data, which can be imported by rake db:load_tree. There is wrapper rake task to use the export module in lib/tasks/v2_export.rake. Run for help:

    • rake -D db:save_tree
    An example:
    • rake db:save_tree model=Folder file=/tmp/WBS.txt

V2MigrationTools (last edited 2015-04-23 09:24:58 by 172)

Copyright 2008-2014, SoftXS GmbH, Switzerland