Working with Text or CSV Files

CSV or Text Files

Anyone who works with Text or CSV files will know this issue. The CSV file was generated by a machine or exported from a database. But it was opened and saved as a CSV file in Excel by the user to check the contents and in doing so the number value is saved as an exponential value and the required data is lost.

You can open a CSV file correctly in Excel by first opening excel and then using the data option and select “From Text/CSV” on the final step select text for each column to preserve the format. This link will guide you through all the steps depending on your excel version.

However your users will probably just double click on the CSV file which will open the file in excel and the damage is done.

I have created a solution to this issue with the following steps.

  1. Create a batch file which will open the Excel Macro and pass the name of the file as a parameter. To ensure the user always uses this method you can right click on a csv file and select open with – then choose default program and point to your batch file.
  2. In Excel I designed a userform to give the user some choices on the filetype and location.
  3. In the workbook I added a macro to load the file into an array and then format the array size as text in excel and paste the values.
  4. Non Standard Characters – I added another function to the userform to allow you to find non standard characters in the file, these characters caused issues when the files were uploaded to other system. This function is based on my character set but you can amend or add to the characters as required.
Csv or Text Files

You can download a copy of the excel file and a sample batch file form this link. All the source for the macro is in the workbook feel free to customise it for your own environment.

Text or CSV Files – How it works

Batch File

The batch file copied the development copy of the macro to a staging location. In the sample batch file below it copies the file from c:\development to c:\csvopener. If the folder c:\csvopener doesnt exist the batch file will create it. To use this you will need to either make the development folder or change it to your desired location. The batch file copies the file passed as a parameter to the batch file to the file echo c:\csvopener\filetoopen.txt. This is used by the macro to populate the data automatically. If the macro workbook is opened directly the file dialog will be displayed for the user to pick the file to import.

@ECHO on
C:
IF EXIST C:\CSVOpener GOTO BOEXISTS
c:
cd\
MD CSVOpener
:BOEXISTS
rem preserve the development copy by copying the file into a new location
copy "C:\development\csvopener.xlsm" "c:\CSVOpener\csvopener.xlsm" 
echo %1 >c:\csvopener\filetoopen.txt
Start "C:\Program Files (x86)\Microsoft Office\Office16\Excel.EXE" "C:\CSVOpener\csvopener.xlsm"
rem Exit

Excel Macros

On Open

When the excel macro is opened the workbook_Open macro is executed. It checks if the file defined in the file c:\csvopener\filetoopen.txt exists. It extracts the file name and then deletes the filetoopen.txt to avoid the process repeating. The filename is then passed to the macro and parsed. In this use case the user had Regis and DTCC files, the regis files were delimited with a Pipe and usually had the naming pattern “R001”, otherwise it is assumed the file is a comma seperated file (csv).

Private Sub Workbook_Open()
Dim CmdRaw As Long
    Dim CmdLine As String
    Dim myParam As String
    
    CmdRaw = GetCommandLine
    CmdLine = CmdToSTr(CmdRaw)
    Dim tmpFileName, tmpFiletoRead
    tmpFileName = "c:\csvopener\filetoopen.txt"
    If Dir(tmpFileName) <> "" Then
        Open tmpFileName For Input As #1
        Do Until EOF(1)
            Line Input #1, tmpFiletoRead
            tmpFiletoRead = Replace(tmpFiletoRead, Chr(34), "")
        Loop
        Close #1
        Kill tmpFileName
    End If
    

    If Len(tmpFiletoRead) <> 0 Then
        If Dir(tmpFiletoRead) <> "" Then
            frmFileSelect.txtFileName = tmpFiletoRead
            
            If InStr(1, tmpFiletoRead, "R001", vbTextCompare) > 1 Then
                'this is a regis file use pipe delimiter
                'If you have other files with a pattern to define the delimiter you can add them here
                frmFileSelect.OptionButton2 = -1
                frmFileSelect.OptionButton1 = 0
            End If
            
            frmFileSelect.Show
        Else
            MsgBox "The file doesnt exist ??"
        End If
    Else
        frmFileSelect.Show
    End If
End Sub


Import CSV Button

In a manual import once the user presses the import button the following code is executed. The code checks the file to ensure it has a CRLF or reqrites the file to add the CRLF. IT then loads the files into an array and populates a new excel workbook with the array data and saves the data as text. This ensures that all the data is preserved.

Private Sub cmdImportFile_Click()
'
On Error GoTo ErrorHandler
'
Dim Row_Number, FileName As String
Dim LineItems() As String, tmpCheckSpecial As Boolean
Dim LineFromFile As String
Dim RowNumber
Dim ColumnNumbers, y, tmpSep
Dim tmpArray() As Variant
Dim tmpMaxRow, tmpMaxCol, fd As Office.FileDialog, i, tmpNewLine
Dim tmpFile
Dim tmpMonth

If OptionButton1.Value = True Then
    tmpSep = ","
ElseIf OptionButton2.Value = True Then
    tmpSep = ";"
Else
    tmpSep = "|"
End If

If chkSpecial Then
    tmpCheckSpecial = True
Else
    tmpCheckSpecial = False
End If

If txtFileName = "" Then
    Set fd = Application.FileDialog(msoFileDialogFilePicker)
    
    With fd
        .AllowMultiSelect = False
        .Title = "Please select the file."
    
        .Filters.Clear
        .Filters.Add "CSV Files", "*.csv"
    
        If .Show = True Then
            txtFileName = Dir(.SelectedItems(1))
        Else
            Exit Sub
        End If
    End With
End If


tmpMaxCol = 0
tmpMaxRow = 0
RowNumber = 0

Open txtFileName For Input As #1
'loop to size the array
Do Until EOF(1)
    Line Input #1, LineFromFile
    LineItems = Split(LineFromFile, tmpSep)
    ColumnNumbers = UBound(LineItems, 1)
    If tmpMaxCol < ColumnNumbers Then tmpMaxCol = ColumnNumbers
    RowNumber = RowNumber + 1
    tmpMaxRow = RowNumber
Loop
Close #1

If RowNumber = 1 Then
    'let check to ensure we dont have a problem with LF CF
    Open txtFileName For Input As #1
    'loop to size the array
    Line Input #1, LineFromFile
    tmpNewLine = Replace(LineFromFile, Chr(10), vbCrLf)
    Close #1
    
    Open txtFileName For Input As #1
    'loop to size the array
    Line Input #1, LineFromFile
    tmpNewLine = Replace(LineFromFile, Chr(10), vbCrLf)
    Close #1
    
    Open txtFileName For Output As #1  'Open file again, but for
    Print #1, tmpNewLine  'use print function to write contents, but with
    Close #1

    'now call the presvious function to get the number of rows
    tmpMaxCol = 0
    tmpMaxRow = 0
    RowNumber = 0
    
    Open txtFileName For Input As #1
    'loop to size the array
    Do Until EOF(1)
        Line Input #1, LineFromFile
        LineItems = Split(LineFromFile, tmpSep)
        ColumnNumbers = UBound(LineItems, 1)
        If tmpMaxCol < ColumnNumbers Then tmpMaxCol = ColumnNumbers
        RowNumber = RowNumber + 1
        tmpMaxRow = RowNumber
    Loop
    Close #1
End If

'open again and now size array
Open txtFileName For Input As #1

ReDim tmpArray(tmpMaxRow, tmpMaxCol)

tmpMaxCol = 0
tmpMaxRow = 0
RowNumber = 0


Do Until EOF(1)
    Line Input #1, LineFromFile
    LineItems = Split(LineFromFile, tmpSep)
    ColumnNumbers = UBound(LineItems, 1)
    If tmpMaxCol < ColumnNumbers Then tmpMaxCol = ColumnNumbers
    For y = 0 To ColumnNumbers
        tmpArray(RowNumber, y) = Replace(LineItems(y), Chr(34), "")
    Next
    Me.lblCaption.Caption = "Processing row " & i
    RowNumber = RowNumber + 1
    tmpMaxRow = RowNumber
Loop
Close #1


i = 1
y = 1


Dim WS As Workbook
Set WS = Workbooks.Add
Range("a1").Select

DoEvents
ActiveSheet.Range(ActiveSheet.Cells(1, 1), ActiveSheet.Cells(tmpMaxRow, tmpMaxCol + 1)).NumberFormat = "@"
ActiveSheet.Range(ActiveSheet.Cells(1, 1), ActiveSheet.Cells(tmpMaxRow, tmpMaxCol + 1)).Value = tmpArray

DoEvents

Range("a1").Select
If chkSpecial Then
    Do While i <= tmpMaxRow
       Do While y <= tmpMaxCol
            If IsSpecial(tmpArray(i, y)) Then
                Me.lblLastFound.Caption = "Column " & y & " in row " & i & " has special characters " & tmpArray(i, y)
                If Me.chkMark Then
                    ActiveCell.Offset(i, y).Interior.ColorIndex = 37
                End If
            End If
            Me.lblCaption.Caption = "Special Check row " & i
            DoEvents
            y = y + 1
        Loop
        i = i + 1
        y = 1
    Loop
End If


tmpFile = GetFilenameFromPath(txtFileName)

ActiveSheet.Range(ActiveSheet.Cells(1, 1), ActiveSheet.Cells(tmpMaxRow, tmpMaxCol + 1)).Columns.AutoFit

tmpMonth = Format(Date, "yyyy-mm")
'check if the backup directories exist
If Dir("c:\csvopener", vbDirectory) = "" Then
    MkDir "c:\csvopener"
End If
If Dir("c:\csvopener\" & tmpMonth, vbDirectory) = "" Then
    MkDir "c:\csvopener\" & tmpMonth
End If
ActiveWorkbook.SaveAs "c:\csvopener\" & tmpMonth & "\" & tmpFile, xlCSV

Unload Me

'close the macro but dont save the loaded content
Workbooks("csvopener.XLSm").Close SaveChanges:=False

Exit Sub

ErrorHandler:
'customise the error handler to advise the user of the error
MsgBox "Something didnt work !" & vbCrLf & "Have you selected the correct type DTCC opr REGIS ?" & vbCrLf & "Error Code " & Err.Number & " -  " & Err.Description

End Sub

Special Characters

The code also contains a function to check for special character and highlight the cell. In the function sCh takes each character and compares it to the pattern. You can encapsulate this in a RegEx but I found this easier to explain and for users to maintain. This function is not run unless selected by the user.

Public Function IsSpecial(s As Variant) As Long
    Dim L As Long, LL As Long
    Dim sCh As String
    IsSpecial = 0
    For L = 1 To Len(s)
        sCh = Mid(s, L, 1)
        If sCh Like "[0-9a-zA-Z]" Or sCh = Chr(34) Or sCh = "_" Or sCh = "&" Or sCh = "=" Or sCh = "*" Or sCh = "-" Or sCh = "/" Or sCh = " " Or sCh = "." Or sCh = ":" Or sCh = ";" Or sCh = "," Or sCh = ")" Or sCh = "(" Then
        Else
            IsSpecial = 1
            Exit Function
        End If
    Next L
End Function


Setting a Default Program for Text or CSV Files

  1. Right-click a .csv file and select Open with > Choose another app.
  2. In the window that appears:
    • Select More apps if needed.
    • Browse to the location of your csvopener.bat file (you might need to type *.bat in the file name field to see .bat files).
    • Check the option Always use this app to open .csv files.
  3. Click OK.


Leave a Reply

Your email address will not be published. Required fields are marked *