bash: shell table output to json

Sometimes it would be really handy to convert tabular shell output into a more versatile format like JSON or YAML, so you can process it with jq instead of writing long text-processing pipes.

Ah yeah, you could use Python instead of bash 😉

Before

$ virsh net-list                     
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes

After

$ virsh net-list | table-to-json 
[
    {
        "autostart": "yes", 
        "name": "default", 
        "persistent": "yes", 
        "state": "active"
    }
]
$ virsh net-list | bin/table-to-json  | jq -r ".[0].name"
default
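Once the table is JSON, follow-up filtering is plain jq; a sketch using the sample object from above:

```shell
# pick the names of all active networks from table-to-json's output
echo '[{"autostart":"yes","name":"default","persistent":"yes","state":"active"}]' \
  | jq -r '.[] | select(.state == "active") | .name'
```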

table-to-json

#!/usr/bin/env python3

import sys
import re
import json

def parse_line(line):
    """Split a table row into cells: use tabs if present, else collapse runs of whitespace."""
    if "\t" not in line:
        line = re.sub(r'\s+', '\t', line)
    return [x for x in line.strip().split("\t") if x]

lastparts = []
columns = None
data = []
for line in sys.stdin:
    parts = parse_line(line)
    if parts:
        if parts[0].startswith("-"):
            # separator row: the row right before it held the column names
            columns = [x.lower() for x in lastparts]
        elif columns:
            data.append(dict(zip(columns, parts)))
    lastparts = parts

print(json.dumps(data, sort_keys=True, indent=4))

Infojunk October 2018

Browser Extensions

Collaborative Coding

Focusing on IDEs. Web-based solutions are mostly ignored.

Linux

NodeJS

DevOps

AI/MachineLearning

AWS

JMESPath is not as powerful as jq, but Amazon probably chose it for AWS since it might be faster and its query selectors are perhaps a bit more sophisticated (?)

Python

Android

Security

Font Ligatures

Other

Color-Laser-Printer: Xerox Workcentre 6515DNI

I finally gave up on my 15-year-old color laser printer, a Konica magicolor 2530DL, due to a failed firmware update. I cannot find out how to reset the firmware via USB stick since most links now lead to 404s, and the DL model from 2003 does not have a parallel port anymore. USB and Ethernet are dead. Help is appreciated!

In contrast to the flawless Linux drivers, the Windows drivers became frustrating after a Windows 10 update: printing often ended with just one page while the rest hung on “processing”. Sometimes the printer misplaced the layout. Sadly, time for a new one!

One thing: coming from an agency background, print quality is of course king. I skipped Samsung, HP and Co. They may have good quality, but I have heard several times that they are not long-lived products and get very hot 🙁

I was looking for

  • A Laser Printer (ever got wet ink on paper?)
  • Good Printing Quality
  • A Duplex Printer
  • If Combi: A Good Scanner (CCD before CIS)
  • WLAN (CloudPrint)
  • Scan to USB
  • Copy
  • Linux Support

Final Round

Xerox Workcentre 6515DNI

If you search for a printer only: Xerox Versalink 500DN

Kyocera Ecosys M5526cdw

  • Printing @ 1200x1200dpi, text sucks because of toner diffusion
  • Eco friendly toner (finally!)
  • Mobile Print/Scan App looks OK though it may crash
  • [Print Service for Android](https://play.google.com/store/apps/details?id=com.kyocera.print.service.plugin)
  • Released 08/2018

Samsung Xpress C1860FW

Brother DCP-L3550CDW

Trackback

PulseAudio: Mono-Sink Audio

Just in case your 10,000+ employee corporation doesn’t plug in the microphone jack correctly and no one is allowed to ask questions (presentation-only).

Find the name of your audio sink by running

pacmd list-sinks | grep name:

Then run this command (replacing the placeholder):

pacmd load-module module-remap-sink sink_name=mono master=<name_given_by_previous_command> channels=2 channel_map=mono,mono

or add that load-module line to /etc/pulse/default.pa to have it run at startup.
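For the startup case, /etc/pulse/default.pa takes the same module line directly. A sketch — the master= sink name below is a made-up example; substitute the name reported by pacmd list-sinks:

```
# /etc/pulse/default.pa (sketch; replace master= with your sink's name)
load-module module-remap-sink sink_name=mono master=alsa_output.pci-0000_00_1b.0.analog-stereo channels=2 channel_map=mono,mono
```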

Then in Sound Preferences choose “Mono” as the output, but remember to reduce volumes by half, since two channels are getting mixed into one, or else you’ll have distortion. To test, run:

speaker-test -c 2 -t sine

Same thing in a single command:

 pacmd load-module module-remap-sink sink_name=mono master=$(pacmd list-sinks | grep  -m 1 -oP 'name:\s<\K.*(?=>)') channels=2 channel_map=mono,mono

To remove the mono channel, just use:

pacmd unload-module module-remap-sink

Source: StackOverflow

Thanks to ondrejch!

git: reducing repository size (gc and destructive)

Garbage Collection (non-destructive)

This works especially well when removing a file that was added in the most recent, unpushed commit. Git’s garbage collection automates some of those cleanup jobs:

I ran the following over my source folders:

for gitPath in $(find . -type d -name ".git" -readable -prune -exec realpath {} \; 2>/dev/null); do
    cd "$gitPath"
    echo "$gitPath"
    sizeBefore=$(du -sh . | cut -f1)
    git fetch -p
    # delete all local branches except the long-lived ones
    git branch --format "%(refname:short)" | grep -vE "^(develop|master|staging)$" | xargs -r git branch -D
    git gc --aggressive --prune=now
    sizeAfter=$(du -sh . | cut -f1)
    echo "${sizeBefore} -> ${sizeAfter}"
done
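As a quick sanity check of the branch filter used above — a sketch with made-up branch names showing what the grep lets through:

```shell
# everything except the protected branches survives the filter
# (and would be handed to `git branch -D`)
printf 'develop\nfeature/login\nmaster\nbugfix/42\nstaging\n' \
  | grep -vE '^(develop|master|staging)$'
```

This prints feature/login and bugfix/42 — exactly the branches that would be deleted.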

Reduce repository size (destructive)

Atlassian has a pretty good article about reducing git repository size. Also take a look at GitHub Help: removing sensitive data from a repository, and the BFG Repo-Cleaner.
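To sketch the destructive route: git filter-branch (built into git; the BFG or git-filter-repo are much faster on large repositories) can purge a file from all history. A throwaway-repo demo with made-up file names:

```shell
# build a throwaway repo containing a committed secret
repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.email demo@example.com && git config user.name demo
echo "hunter2" > secrets.txt && echo "code" > app.txt
git add . && git commit -qm "initial"

# rewrite every commit, dropping secrets.txt from the index
export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch --force --index-filter \
  'git rm --cached --ignore-unmatch secrets.txt' \
  --prune-empty --tag-name-filter cat -- --all

# remove backup refs and expire reflogs so gc can actually delete the blobs
rm -rf .git/refs/original/
git reflog expire --expire=now --all
git gc --prune=now --aggressive
git log --all --oneline -- secrets.txt   # no output: the file is gone from history
```

Keep in mind this rewrites history: the result must be force-pushed and every collaborator has to re-clone.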

AWS sync is not reliable!

While migrating from s3cmd to the aws s3 CLI I noticed that files don’t sync reliably when using the aws CLI.

I tested so far with different versions and they all revealed the same behavior:

  • python2.7-awscli1.9.7
  • python2.7-awscli1.15.47
  • python3.6-awscli1.15.47

Test-Setup

  1. Setup AWS CLI utility and configure your credentials
  2. Create a testing S3 bucket
  3. Setup some random files
    # create 10 random files of 10 MB each
    mkdir -p multi multi-changed
    for i in {1..10}; do dd if=/dev/urandom of=multi/part-$i.out bs=1MB count=10; done
    # then copy the first 5 files over
    cp multi/part-{1,2,3,4,5}.out multi-changed
    # and replace the content in the other 5 files
    for i in {6..10}; do dd if=/dev/urandom of=multi-changed/part-$i.out bs=1MB count=10; done
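A quick sanity check that the two trees really differ only in parts 6–10 (a sketch; cmp compares file contents byte by byte):

```shell
# parts 1-5 should be identical between the trees, parts 6-10 should differ
for i in $(seq 1 10); do
  cmp -s multi/part-$i.out multi-changed/part-$i.out \
    && echo "part-$i.out identical" \
    || echo "part-$i.out differs"
done
```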

Testing S3 sync with aws cli

Cleanup

$ aws s3 rm s3://l3testing/multi --recursive 

Initial sync

$ aws s3 sync multi s3://l3testing/multi
upload: multi/part-1.out to s3://l3testing/multi/part-1.out         
upload: multi/part-3.out to s3://l3testing/multi/part-3.out      
upload: multi/part-2.out to s3://l3testing/multi/part-2.out      
upload: multi/part-4.out to s3://l3testing/multi/part-4.out      
upload: multi/part-10.out to s3://l3testing/multi/part-10.out    
upload: multi/part-5.out to s3://l3testing/multi/part-5.out      
upload: multi/part-6.out to s3://l3testing/multi/part-6.out      
upload: multi/part-8.out to s3://l3testing/multi/part-8.out      
upload: multi/part-7.out to s3://l3testing/multi/part-7.out      
upload: multi/part-9.out to s3://l3testing/multi/part-9.out  

Update files

Only 5 files should now be uploaded. Timestamps for all 10 files should be changed.

$ aws s3 sync multi-changed/ s3://l3testing/multi/

ERROR: No files synced!

Testing with s3cmd

Cleanup

$ aws s3 rm s3://l3testing/multi --recursive 

Initial sync

$ s3cmd sync -v --check-md5 --delete-removed multi/ s3://l3testing/multi/
upload: 'multi/part-1.out' -> 's3://l3testing/multi/part-1.out'  [1 of 10]
 10000000 of 10000000   100% in    1s     5.12 MB/s  done
upload: 'multi/part-10.out' -> 's3://l3testing/multi/part-10.out'  [2 of 10]
 10000000 of 10000000   100% in    1s     7.54 MB/s  done
upload: 'multi/part-2.out' -> 's3://l3testing/multi/part-2.out'  [3 of 10]
 10000000 of 10000000   100% in    1s     8.60 MB/s  done
upload: 'multi/part-3.out' -> 's3://l3testing/multi/part-3.out'  [4 of 10]
 10000000 of 10000000   100% in    1s     7.17 MB/s  done
upload: 'multi/part-4.out' -> 's3://l3testing/multi/part-4.out'  [5 of 10]
 10000000 of 10000000   100% in    1s     7.72 MB/s  done
upload: 'multi/part-5.out' -> 's3://l3testing/multi/part-5.out'  [6 of 10]
 10000000 of 10000000   100% in    1s     8.19 MB/s  done
upload: 'multi/part-6.out' -> 's3://l3testing/multi/part-6.out'  [7 of 10]
 10000000 of 10000000   100% in    1s     7.60 MB/s  done
upload: 'multi/part-7.out' -> 's3://l3testing/multi/part-7.out'  [8 of 10]
 10000000 of 10000000   100% in    1s     7.73 MB/s  done
upload: 'multi/part-8.out' -> 's3://l3testing/multi/part-8.out'  [9 of 10]
 10000000 of 10000000   100% in    1s     7.52 MB/s  done
upload: 'multi/part-9.out' -> 's3://l3testing/multi/part-9.out'  [10 of 10]
 10000000 of 10000000   100% in    1s     8.31 MB/s  done
Done. Uploaded 100000000 bytes in 12.9 seconds, 7.38 MB/s.

Now update the files

Only 5 files should now be uploaded. Timestamps for all 10 files should be changed.

s3cmd sync  --delete-removed multi-changed/  s3://l3testing/multi/ 
upload: 'multi-changed/part-10.out' -> 's3://l3testing/multi/part-10.out'  [1 of 5]
 10000000 of 10000000   100% in    1s     5.97 MB/s  done
upload: 'multi-changed/part-6.out' -> 's3://l3testing/multi/part-6.out'  [2 of 5]
 10000000 of 10000000   100% in    1s     9.45 MB/s  done
upload: 'multi-changed/part-7.out' -> 's3://l3testing/multi/part-7.out'  [3 of 5]
 10000000 of 10000000   100% in    1s     9.18 MB/s  done
upload: 'multi-changed/part-8.out' -> 's3://l3testing/multi/part-8.out'  [4 of 5]
 10000000 of 10000000   100% in    1s     8.81 MB/s  done
upload: 'multi-changed/part-9.out' -> 's3://l3testing/multi/part-9.out'  [5 of 5]
 10000000 of 10000000   100% in    1s     8.79 MB/s  done
Done. Uploaded 50000000 bytes in 5.8 seconds, 8.17 MB/s.

Note: s3cmd also supports --dry-run.

SUCCESS: File content got updated…
WARNING: …but the timestamps did not.

Analysis

Summary

Using --debug and aws s3api list-objects --bucket l3testing reveals that objects are stored as storage-class=STANDARD and do have their hashes.

Using the aws CLI options --exact-timestamps and --delete, and the payload_signing_enabled option, changed nothing.

Looking at the sync strategies (search for syncstrategy) within the aws CLI sources reveals that they are really shoddy, and GitHub issues show they still do a lot of unnecessary work. Stack Overflow and GitHub reveal several further issues, also when syncing files over 5 GB.
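The behavior is easy to reproduce locally: by default aws s3 sync re-uploads only when the size differs or the local file is newer than the remote object — content is never hashed. A minimal sketch (Linux, GNU stat) of what that strategy sees versus a content check:

```shell
# two same-size files with different content and identical timestamps
printf 'aaaa' > old.out
printf 'bbbb' > new.out
touch -r old.out new.out    # copy old.out's mtime onto new.out

# size + mtime comparison (aws s3 sync's default view): nothing to do
if [ "$(stat -c%s old.out)" -eq "$(stat -c%s new.out)" ] && \
   [ "$(stat -c%Y old.out)" -eq "$(stat -c%Y new.out)" ]; then
  echo "sync: skip"
fi

# content comparison (what s3cmd --check-md5 effectively does): change detected
cmp -s old.out new.out || echo "content differs"
```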

AWS Default sync fails MD5 #facepalm

We also get this when checking with s3cmd after an initial aws CLI sync:

$ s3cmd sync -v --dry-run  multi-changed/  s3://l3testing/multi/
INFO: No cache file found, creating it.
INFO: Compiling list of local files...
INFO: Running stat() and reading/calculating MD5 values on 10 files, this may take some time...
INFO: Retrieving list of remote files for s3://l3testing/multi/ ...
INFO: Found 10 local files, 10 remote files
INFO: Verifying attributes...
INFO: disabled md5 check for part-1.out
INFO: disabled md5 check for part-10.out
INFO: disabled md5 check for part-2.out
INFO: disabled md5 check for part-3.out
INFO: disabled md5 check for part-4.out
INFO: disabled md5 check for part-5.out
INFO: disabled md5 check for part-6.out
INFO: disabled md5 check for part-7.out
INFO: disabled md5 check for part-8.out
INFO: disabled md5 check for part-9.out
INFO: Summary: 0 local files to upload, 0 files to remote copy, 0 remote files to delete
INFO: Done. Uploaded 0 bytes in 1.0 seconds, 0.00 B/s.

Also, when we use s3cmd for the initial sync, the aws CLI won’t be able to sync afterwards either.

The aws CLI internally uses boto3 and the s3api CreateMultipartUpload task for multipart uploads. Inspecting those uploads shows that MD5 checksums for the consolidated uploaded parts are correctly transferred but somehow not stored.

Better solutions?

Tooling

Sure! My choice would be s4cmd, which does the sync perfectly and is currently as fast as node-s3-cli. The aws CLI is just as fast but, well, has a faulty sync. node-s3-cli is based on Node, and it reportedly still has some issues.

Performance

Activating the fast-bucket option in the AWS console just provides more reliable connections (less latency). The speed difference ranges from roughly -7% to +7% depending on location. I can sometimes observe that with too many connections it hangs a bit. Still, I do not recommend paying for that micro-option, since multipart uploads with files consolidated on the client side should be standard for the HTTPS S3 API.

Further notes

AWS just uses MD5, which should be sufficient for most files (yet I have run into MD5 collisions in my life as a developer!)
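When MD5 alone is not trustworthy enough, a stronger digest can be computed client-side for verification; sha256sum ships in GNU coreutils alongside md5sum:

```shell
# compare digests of the same content with both algorithms
printf 'hello world' > sample.txt
md5sum sample.txt
sha256sum sample.txt
```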

From their documentation

--payload_signing_enabled Refers to whether or not to SHA256 sign sigv4 payloads. By default, this is disabled for streaming uploads (UploadPart and PutObject) when using https.

Infojunk September 2018

Kotlin

Python

Markdown Note-Taking

Some note-taking apps you should give a try. At least Notion is very promising (yet you have to pay).

Note: in the end I went with Visual Studio Code and its Markdown editors. Boostnote was the best free application (yet with bugs). Notion is the best paid app matching my requirements 😉

Web UX

Markdown WYSIWYM editors

Linux

  • ASSH – Go wrapper around SSH with automated hops and gateways
  • USB Power Saving (Thinkpad)
  • [List of Linux Monitoring Tools](https://www.ubuntupit.com/most-comprehensive-list-of-linux-monitoring-tools-for-sysadmin/)
  • Tracktion7
  • osquery – query system resources, by Facebook

Productivity

  • CodeStream – make working on source code collaborative (intelligent and live comments 😉)

Web Scraping and Acceptance Testing

Forget PhantomJS or Selenium! Nightmare is the shit if you want to quickly scrape data or need a background browser. Of course, acceptance testing should be done with WebdriverIO.

Android Apps

Other

Connecting to Checkpoint QVPN SXN in Linux

Prerequisites

Ensure you have received their e-mail with the following information:

  • VPN Certificate file (.p12)
  • Your VPN password
  • Your server username

Please use that information to replace placeholders in scripts found in this tutorial.

Installation script

You can either download it from their website (crappy and frustrating) or get it directly via http://gateway-ip.

Look for a file called snx_install_linux**.sh

wget http://gateway-ip/**/snx_install_linux**.sh

Security: let’s have a look at what is distributed and how running it will affect our system.

$ cat snx_install_linux30-7075.sh | sed -e 's/^.*\(\x42\x5A.*\)/\1/g' | tar -jtvf -
-rwxr-xr-x builder/fw 3302196 2012-12-06 14:02 snx
-r--r--r-- builder/fw 747 2012-12-06 14:02 snx_uninstall.sh

Installation

$ sudo chmod +x snx_install_linux30-7075.sh
$ sudo ./snx_install_linux30-7075.sh

You may have some libraries missing since the client is still 32-bit.

$ sudo ldd /usr/bin/snx | grep "not found"
libpam.so.0 => not found
libstdc++.so.5 => not found

So here we need some legacy i386 packages:

$ sudo apt-get install libx11-6:i386 libstdc++5:i386 libpam0g:i386

Connect to VPN

$ snx -c path-to-key/rl_johnbarleycorn.p12 -g -s company.inetservices.com companyvpn
Check Point's Linux SNX
build 800007075
Please enter the certificate's password:
SNX authentication:
Please confirm the connection to gateway: companyvpn VPN Certificate
Root CA fingerprint: MELT ELSE FUN BLUE ONUS GORE GAD SWAM VAST CHAT YAWL FOUR
Do you accept? [y]es/[N]o:
y
SNX - connected.
Session parameters:
===================
Office Mode IP : 172.16.10.145
Timeout : 12 hours

(exit code 0)

Debugging

$ ssh -vvv vq
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password
[…]

Check what it set up

$ ifconfig | grep -A 8 tunsnx
tunsnx: flags=4305<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST>  mtu 1500
        inet 172.16.10.145  netmask 255.255.255.255  destination 172.16.10.144
        inet6 fe80::ed2a:98f2:a47:8555  prefixlen 64  scopeid 0x20<link>
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 100  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 25  bytes 2252 (2.2 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

And for the routes:

$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.7.5.0        0.0.0.0         255.255.255.0   U     0      0        0 tunsnx
10.7.6.0        0.0.0.0         255.255.255.0   U     0      0        0 tunsnx
10.8.4.0        0.0.0.0         255.255.254.0   U     0      0        0 tunsnx
10.8.6.0        0.0.0.0         255.255.255.0   U     0      0        0 tunsnx
10.14.14.15     0.0.0.0         255.255.255.255 UH    0      0        0 tunsnx
10.14.14.15     0.0.0.0         255.255.255.255 UH    2      0        0 tunsnx
10.200.1.12     0.0.0.0         255.255.255.255 UH    0      0        0 tunsnx
10.200.1.12     0.0.0.0         255.255.255.255 UH    2      0        0 tunsnx
10.200.13.0     0.0.0.0         255.255.255.0   U     0      0        0 tunsnx
10.200.13.0     0.0.0.0         255.255.255.0   U     2      0        0 tunsnx
10.200.14.0     0.0.0.0         255.255.255.0   U     0      0        0 tunsnx
10.200.14.0     0.0.0.0         255.255.255.0   U     2      0        0 tunsnx
10.200.28.9     0.0.0.0         255.255.255.255 UH    0      0        0 tunsnx
10.200.28.9     0.0.0.0         255.255.255.255 UH    2      0        0 tunsnx
10.200.29.0     0.0.0.0         255.255.255.0   U     0      0        0 tunsnx
10.200.29.0     0.0.0.0         255.255.255.0   U     2      0        0 tunsnx
172.16.10.68    0.0.0.0         255.255.255.255 UH    0      0        0 tunsnx

Automating connection

./snx-vpn-up:

#!/bin/bash

# trap ctrl-c and call ctrl_c()
trap ctrl_c INT

function ctrl_c() {
  snx -d
}

showroutes() {
  echo Routes:
  echo =======
  ip route | grep tunsnx
  if [ "$?" -ne 0 ]; then
    echo "Something failed. No routes? Try again."
    echo
    snx-vpn-down
    exit 1
  fi
}

ROUTES=$( ip route | grep tunsnx )
if [ ! -z "$ROUTES" ]; then
   echo "Already connected."
   echo
   showroutes
   exit 1
fi

echo "SNX - Connecting..."
echo 'PASSWORD' | snx -g -c path-to-key/rl_johnbarleycorn.p12  -s IP
sleep 1
showroutes
sleep 1
echo
echo /home/$( whoami )/snx.elg
echo =====
tail -n 1000 -f /home/$( whoami )/snx.elg

If this stops working at any point in the future, use expect.

./snx-vpn-down:

#!/bin/bash
if [ -z "$( pgrep snx)" ]; then
  echo "SNX was not running."
  exit 1
fi

snx -d

GitLab: checkout all available repositories

Generate a private token

https://<GITLAB-SERVER1>/profile/personal_access_tokens
https://<GITLAB-SERVER2>/profile/personal_access_tokens

Checkout a list of all available repositories

QUERY='.[] | .path_with_namespace + "\t" + .ssh_url_to_repo' # jq query
curl --request GET --header "PRIVATE-TOKEN: <PRIVATE-TOKEN>" "<GITLAB-SERVER1>/api/v4/projects?simple=true&per_page=65536" | jq -r "$QUERY" > repo.list
curl --request GET --header "PRIVATE-TOKEN: <PRIVATE-TOKEN>" "<GITLAB-SERVER2>/api/v3/projects?simple=true&per_page=65536" | jq -r "$QUERY" >> repo.list
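For reference, here is what the jq query produces, fed with a hypothetical single-project API response — each line is path, a tab, then the SSH URL, which is what the later cut and parallel steps rely on:

```shell
echo '[{"path_with_namespace":"group/project","ssh_url_to_repo":"git@gitlab.example.com:group/project.git"}]' \
  | jq -r '.[] | .path_with_namespace + "\t" + .ssh_url_to_repo'
```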

Create directories for repositories

cat repo.list | cut -f1 | xargs mkdir -p

Checkout projects (with GNU parallel)

parallel --colsep '\t' --jobs 4 -a repo.list git clone {2} {1}

Build list of git repositories

find -type d -name ".git"  | xargs realpath | xargs dirname > path.list  

Report repository branch or checkout branch

cat path.list | xargs -I{} sh -c "cd {}; echo {}; git branch"
cat path.list | xargs -I{} sh -c "cd {}; echo {}; git checkout master"
cat path.list | xargs -I{} sh -c "cd {}; echo {}; git checkout develop"

Note: when you are migrating repositories you should use git clone --mirror

Update: if you don’t get all projects and just get 404s, you’re fucked. Try creating the list from what you see while browsing GitLab, or try to get admin access.