Add Jenkins Tool Location with ansible

Recently I had to set up a Jenkins slave with Ansible. I used some Ansible roles from Ansible Galaxy, but all of them missed setting the tool location.

So I read some Javadoc and added the tool location configuration to the Groovy/Jinja2 template.

import hudson.model.AbstractBuild
import hudson.model.Node
import hudson.plugins.sshslaves.SSHLauncher
import hudson.plugins.sshslaves.verifiers.NonVerifyingKeyVerificationStrategy
import hudson.plugins.sshslaves.verifiers.SshHostKeyVerificationStrategy
import hudson.slaves.ComputerLauncher
import hudson.slaves.DumbSlave
import hudson.util.VariableResolver
import jenkins.model.Jenkins
import hudson.slaves.*
import hudson.slaves.EnvironmentVariablesNodeProperty.Entry
import hudson.tools.ToolLocationNodeProperty
import hudson.plugins.git.GitTool
import hudson.plugins.git.GitTool$DescriptorImpl
import hudson.tools.ToolInstallation

List env = new ArrayList();
{% for item in (slave_environments | dict3items) %}
env.add(new Entry("{{ item.key }}", "{{ item.value }}"))
{% endfor %}

List toolLoc = new ArrayList();
{% for item in (slave_tool_locations ) %}
toolLoc.add(new ToolLocationNodeProperty.ToolLocation(ToolInstallation.all().get(hudson.plugins.git.GitTool$DescriptorImpl.class), "{{ item.name }}", "{{ item.home }}"))
{% endfor %}

EnvironmentVariablesNodeProperty envPro = new EnvironmentVariablesNodeProperty(env);
ToolLocationNodeProperty envToolLoc = new ToolLocationNodeProperty(toolLoc);

Jenkins jenkins = Jenkins.getInstance()

String nodeHostname = "{{ slave_linux_host }}"
String nodeCredentialID = "{{ slave_linux_jenkins_cred_id }}"
int nodePort = {{ slave_linux_ssh_port }}
SshHostKeyVerificationStrategy strategy = new NonVerifyingKeyVerificationStrategy()

ComputerLauncher nodeLauncher = new SSHLauncher( nodeHostname, nodePort,
nodeCredentialID, null, null, null, null, 31, 20, 10, strategy )

String nodeName = "{{ slave_agent_name }}"
String nodeRemoteFS = "{{ slave_linux_home }}"

Node node = new DumbSlave(nodeName, nodeRemoteFS, nodeLauncher)
node.setNumExecutors({{ slave_executors_num }})
node.setLabelString("{{ slave_linux_labels | join(' ') }}")
node.getNodeProperties().add(envPro)
node.getNodeProperties().add(envToolLoc)
{% if slave_mode_exclusive %}
node.setMode(Node.Mode.EXCLUSIVE)
{% else %}
node.setMode(Node.Mode.NORMAL)
{% endif %}
jenkins.addNode(node) 
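
To give an idea of how the template could be wired into a play, here is a sketch; the template file name, the Jenkins URL/credential variables and the example value for slave_tool_locations are assumptions:

# example value for slave_tool_locations (e.g. in group_vars)
slave_tool_locations:
  - name: "Default"
    home: "/usr/bin/git"

# task that renders the Jinja2/Groovy template above and runs it on the master
- name: Create the agent via the Jenkins script console
  community.general.jenkins_script:
    url: "{{ jenkins_url }}"
    user: "{{ jenkins_admin_user }}"
    password: "{{ jenkins_admin_password }}"
    script: "{{ lookup('template', 'create_slave.groovy.j2') }}"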

If this helped you, consider sending me flowers or a donut.

docker volume performance (portworx, nfs, bind mount)

I am considering NFS for shared volumes on Docker Swarm, so I wondered how Docker NFS volumes perform.

Docker cached or delegated bind mounts have a huge effect on reads and mostly outperform native reads by far. With direct I/O these options have no effect.
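
For reference, cached and delegated are flags set per bind mount; a minimal example (the host path is just a placeholder):

docker run -it --rm -v "$(pwd)/data:/bindmount:delegated" ubuntu /bin/bash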

The tests were run on a single host, all against the same physical disk, so network latency is not considered. The Portworx instance was a standalone setup and might not represent the performance of a large cluster with replication.
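
The NFS volume used below was created roughly like this (the NFS server address and export path are placeholders):

docker volume create --driver local \
  --opt type=nfs \
  --opt o=addr=192.168.1.20,rw \
  --opt device=:/export/nfstest \
  nfstest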

docker run -it --rm -v jenkins_vol:/demo -v nfstest:/nfs ubuntu /bin/bash

root@686890cca159:/# dd if=/dev/zero of=/bindmount/test bs=1M count=1000 oflag=direct
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 4.80277 s, 218 MB/s
root@686890cca159:/# dd if=/dev/zero of=/portworxvolume/test bs=1M count=1000 oflag=direct
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 6.08722 s, 172 MB/s
root@686890cca159:/# dd if=/dev/zero of=/nfs/test bs=1M count=1000 oflag=direct
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 15.3807 s, 68.2 MB/s

But NFS still seems to be significantly slower, and NFS tuning did not change the write performance.

Index Oracle RDBMS in Solr using Apache Spark

After I wrote a multiprocessing Python script, scalable across multiple Docker containers, to load billions of records from Oracle RDBMS into SolrCloud, I started to think about simpler solutions.

The basic idea is to let Spark handle all multiprocessing and scaling aspects. Since Spark can partition JDBC data sources out of the box, the only proof needed was that the data can be saved to Solr. Spark-Solr provides the required functionality.

Environment Preparation:

  • Set up SolrCloud on Docker (one node in cloud mode is sufficient; create a collection "test")
  • Run Oracle 18c on Docker
  • Run a Spark cluster on Docker (I used these Spark images)

Copy the JDBC driver and the spark-solr driver to the Spark containers:

docker cp ojdbc8.jar spark-master:/
docker cp spark-solr-3.6.4-shaded.jar  spark-master:/
docker cp ojdbc8.jar spark-worker-1:/
docker cp spark-solr-3.6.4-shaded.jar  spark-worker-1:/

Start the Spark shell (I use pyspark) and load the data into a data frame:

docker exec -it spark-master /spark/bin/pyspark --jars /ojdbc8.jar,/spark-solr-3.6.4-shaded.jar

empDF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:oracle:thin:test/test@//192.168.1.12:1521/demo01") \
    .option("dbtable", "test.emp") \
    .option("user", "test") \
    .option("password", "****") \
    .option("driver", "oracle.jdbc.driver.OracleDriver") \
    .load()
empDF.printSchema()
empDF.show()

Now save the data to SolrCloud:

empDF.write.format("solr").option("zkhost","192.168.1.12:9983").option("collection","test").mode("append").save()

And the data is in SolrCloud.

On a test with more data I had to add the option .option("commit_within", "5000"); I could not find an explicit commit in spark-solr.
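
With that option the write looks like this:

empDF.write.format("solr") \
    .option("zkhost", "192.168.1.12:9983") \
    .option("collection", "test") \
    .option("commit_within", "5000") \
    .mode("append").save()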

This solution should scale well on a Spark cluster by partitioning the data on the JDBC side, and transformations can be added in Spark. Such a solution would be much less complex and easier to maintain than a Python multiprocessing solution (whoever has used Python multiprocessing may know what I mean).
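
A sketch of what the partitioned JDBC read could look like; the partition column "empno" and the bounds are assumptions and must fit the actual table:

# partitionColumn must be a numeric or date column of the source table
empDF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:oracle:thin:test/test@//192.168.1.12:1521/demo01") \
    .option("dbtable", "test.emp") \
    .option("user", "test") \
    .option("password", "****") \
    .option("driver", "oracle.jdbc.driver.OracleDriver") \
    .option("partitionColumn", "empno") \
    .option("lowerBound", "1") \
    .option("upperBound", "1000000") \
    .option("numPartitions", "8") \
    .load()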

If you have used a similar setup, please share your experiences.

Scale a docker application with HAProxy

To be able to scale a Docker service, an instance cannot have a static published port.
HAProxy is a simple way to load balance traffic across the different Docker containers.

The concept is that every container exposes a port (it can be the same for all) but does not publish any ports.
HAProxy publishes a single port and distributes the incoming traffic to all containers.

The application has to set the environment variable SERVICE_PORTS (for Swarm). That's all.

See this docker-compose file (for a stack):

version: '3'
services:
  app:
    build: .
    image: rit_app
    environment:
      - SERVICE_PORTS=80
    expose:
      - 80
    networks:
      - nw-rit
  ha:
    image: dockercloud/haproxy
    depends_on:
      - app
    environment:
      - BALANCE=roundrobin
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    ports:
      - "8102:80"
    networks:
      - nw-rit
    deploy:
      placement:
        constraints: [node.role == manager]
...
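
Deploy and scale it like this (the stack name "rit" is just an example):

docker stack deploy -c docker-compose.yml rit
docker service scale rit_app=3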

How to check the JDBC client version in your database?

On the Java side you can check the JDBC client version very easily:

java -jar ./mwhome/.../ojdbc7.jar -getversion
Oracle 12.1.0.2.0 JDBC 4.1 compiled with JDK7 on Mon_Jun_30_11:30:34_PDT_2014
#Default Connection Properties Resource
#Tue Apr 17 14:13:51 CEST 2018

But sometimes you may not have access to the application server (here WebLogic), or there are many versions installed on the application side.
So the best way is to check in your Oracle database who your clients are.

SQL> select SID,NETWORK_SERVICE_BANNER,CLIENT_CHARSET,CLIENT_CONNECTION,CLIENT_VERSION,CLIENT_DRIVER from v$session_connect_info;
SID  NETWORK_SERVICE_BANNER                                                   CLIENT_CHARSET  CLIENT_CONNECTION  CLIENT_VERSION  CLIENT_DRIVER
396  Crypto-checksumming service for Linux: Version 12.2.0.1.0 - Production  Unknown         Heterogeneous      12.1.0.2.0      jdbcthin

This is a reliable way to find out how (jdbcthin, oci, ...) and with which JDBC version your clients are connecting.

Oracle: update an XML value in place with SQL

Oracle SQL provides a simple way to update values in an XML document.
The "updatexml" function can update any field addressed by an XPath.

update scott.demo d
set
  myval = updatexml(d.myval, '/myroot/mynode/mytime/text()', to_char(systimestamp, 'YYYY-MM-DD"T"HH24:MI:SS.FF3"Z"'))
where id = 1234;

To update multiple values at once:

updatexml(xmltype, xpath1, rep1, xpath2, rep2, ...)

Very handy for updating large documents in place.
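
For example, to change two nodes in one statement (the second node name is made up):

update scott.demo d
set myval = updatexml(d.myval,
              '/myroot/mynode/mytime/text()', to_char(systimestamp, 'YYYY-MM-DD"T"HH24:MI:SS.FF3"Z"'),
              '/myroot/mynode/mystate/text()', 'DONE')
where id = 1234;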

Log in to an Oracle Database without a password / Autologin

Here I try to explain the simplest and shortest way to log in with an Oracle wallet, i.e. without a password.

Create the wallet and store the credential:

mkstore -wrl <wallet_dir> -create
mkstore -wrl <wallet_dir> -createCredential mydbservice.example.ch <db_user>
mkstore -wrl <wallet_dir> -listCredential

configure client side sqlnet.ora:

SQLNET.WALLET_OVERRIDE = TRUE
WALLET_LOCATION =
  (SOURCE =
    (METHOD = FILE)
    (METHOD_DATA = (DIRECTORY = <wallet_dir>))
  )

configure tns alias in tnsnames.ora:

mydbservice_al.example.ch =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = server1.example.ch)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = mydbservice.achat.example.ch)
    )
  )

If the Oracle Instant Client is used, the location of sqlnet.ora and tnsnames.ora needs to be defined with the TNS_ADMIN environment variable. Example:

LD_LIBRARY_PATH=/usr/lib/oracle/12.2/client64/lib
TNS_ADMIN=/etc/oracle/tnsnames
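
With that in place the login itself looks like this; the credential key stored in the wallet has to match the connect string used:

sqlplus /@mydbservice_al.example.ch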

Oracle on Docker (I): Backup / Restore Oracle RDBMS on Docker

Running Oracle on Docker is a great way to test features or set up development environments, especially for offline development on notebooks that require an Oracle database.

This assumes you have built an Oracle image according to the Oracle RDBMS Docker build scripts, chosen to create your database on Docker volumes, and mounted the backup1 volume at /backup.

docker volume create oracledb1
docker volume create backup1
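
A sketch of starting the container with both volumes; the image name and the data path /opt/oracle/oradata follow Oracle's official build scripts and may differ in your setup:

docker run -d --name ora122 \
  -p 1521:1521 \
  -v oracledb1:/opt/oracle/oradata \
  -v backup1:/backup \
  oracle/database:12.2.0.1-ee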

Create an offline backup:

docker exec -u 0 -it ora122 bash
bash-4.2# chown oracle /backup

docker exec -ti ora122 sqlplus / as sysdba
SQL> alter database close;

Database altered.

docker exec -ti ora122 rman target /
backup database format '/backup/%U' tag=bk1;

Restore Database from Backup

docker exec -ti ora122 rman target /
RMAN> restore database from tag=BK1

RMAN> shutdown immediate

RMAN> startup mount

RMAN> alter database open resetlogs;

PL/SQL Unit Test for SQL generator

If PL/SQL is unit tested, utPLSQL (http://utplsql.org/utPLSQL/) is probably the most mature framework to implement it.
Here I suggest a way to check whether generated SQL is executable.

We have a package P_SEARCH with a function generate_sql that generates dynamic SQL.
Since there is no native exception checking in utPLSQL, we execute the generated SQL in a block.
On an error we do a failing comparison "ut.expect('ok').to_equal('error')".
If the statement executes without an error we do a passing comparison "ut.expect('ok').to_equal('ok')".

To lower the impact of the SQL execution, the statement is wrapped in a count select.
If executing the statement is still inappropriate for performance reasons, it could be replaced with a DBMS_SQL parse, for example:

  c := DBMS_SQL.OPEN_CURSOR;
  DBMS_SQL.PARSE(c, 'select * from multi_tab order by 1', DBMS_SQL.NATIVE);
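
A self-contained sketch of that parse-only variant; the statement here is just a stand-in for the generated SQL:

declare
  c     integer;
  v_sql varchar2(4000) := 'select * from dual';  -- stand-in for the generate_sql output
begin
  c := dbms_sql.open_cursor;
  -- parse only: the statement is validated but never executed
  dbms_sql.parse(c, 'select count(*) from (' || v_sql || ')', dbms_sql.native);
  dbms_sql.close_cursor(c);
end;
/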

Package Header and Body of the unit test:

create or replace package ut3_P_SEARCH as
  -- %suite(p_search)
  -- %suitepath(all.online)

  --%beforeall
  procedure global_setup;

  --%afterall
  procedure global_cleanup;

  --%beforeeach
  procedure test_setup;

  --%aftereach
  procedure test_cleanup;
 
  -- %test
  -- %displayname(generate solr sql)
  procedure gensql_starship;

end ut3_P_SEARCH;
/


create or replace package body ut3_P_SEARCH as
  procedure global_setup is
  begin
    null;
  end;

  procedure test_setup is
  begin
    null;
  end test_setup;

  procedure global_cleanup is
  begin
    null;
  end;

  procedure test_cleanup is
  begin
    null;
  end test_cleanup;


  procedure gensql_starship is
    v_sql                 varchar2(4000);
  begin
    -- retrieve generated sql for person class
    select scott.p_search.generate_sql('STARSHIP') into v_sql from dual;
    -- check if executable
    begin
      execute immediate 'select count(*) from ('||v_sql||')';
      ut.expect('ok').to_equal('ok');
    exception  when others then
      ut.expect('ok').to_equal('error');
    end;
    -- check tables 
    ut.expect(v_sql).to_be_like(a_mask=>'%'|| p_metadata_util.f_class_relational_name('PERSON') ||'%');
    -- check static attributes
    ut.expect(v_sql).to_be_like(a_mask=>'%"id"%');
  END gensql_starship;


end ut3_P_SEARCH;
/

Test execution:
SQL> begin ut.run('scott.ut3_p_search.gensql_starship'); end;
  2  /
all
online
p_search
generate solr sql
Finished in .014174 seconds
1 tests, 0 failed, 0 errored, 0 disabled, 0 warning(s)

PL/SQL procedure successfully completed.

A good source for PL/SQL testing and error handling is Jacek Gebal's blog (http://www.oraclethoughts.com/); he is also one of the main developers of utPLSQL (version 3).