Tuesday 29 May 2018

Python subprocess stdout and stderr

In our machine learning pipeline we have a number of Python subprocesses being started from the main Python process. Originally, subprocesses were started using this simple procedure:

import subprocess

def check_call(commands, shell=False):
    process = subprocess.Popen(commands, shell=shell, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    if process.returncode != 0:
        raise IOError(
                'Call failed, cmd=%s, return_code=%s, stderr=%s, stdout=%s' % (commands, process.returncode, stderr, stdout))
    return stdout, stderr

The drawback of this solution is that process.communicate() is blocking, and the subprocess standard output and error output are accumulated into variables and available only after the execution ends or fails. In our case that could be tens of minutes or even a few hours. The issue was partially worked around by logging into a file in the subprocess, which gives some sort of real-time feedback about its progress. Also, when the subprocess hangs, it blocks the main process forever and manual action might be necessary to unblock the pipeline.
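
To illustrate, calling the procedure above looks like this - all of the subprocess output shows up only after it has finished (the script name here is just a placeholder)...

# With the blocking check_call() nothing is visible while the subprocess runs;
# the whole output arrives at once when communicate() finally returns.
stdout, stderr = check_call(['python2.7', 'long_running_job.py'])
print(stdout)
print(stderr)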

I've been looking for a better solution which would make standard output lines available as soon as they are produced by the subprocess. While it is an old problem, I've found only prototypes and no ready-to-use code. So here comes my take on it, based on unutbu's Stack Overflow response and another one from Vadim Fint, merged together and pimped up a bit. The code is for Python 2.7 and it will probably work in Python 3 as well, but then you should probably go with a proper Python 3 solution.

The asynchronous version is lightweight, as it does not need to spawn threads to read outputs and perform timeout checks. It uses only the standard Python library and works very well for us.


import sys
import subprocess
import fcntl
import select
import os
import errno
import datetime

def execute(commands, sysout_handlers=[sys.stdout.write], syserr_handlers=[sys.stderr.write], shell=False, cwd=None, timeout_sec=None):
    """Executes commands using subprocess.Popen and asynchronous select and fcntl to handle subprocess standard output and standard error stream
    
    Examples:
        execute(['python2.7', 'random_print.py'])
        execute(['python2.7', 'random_print.py'], outlist.append, errlist.append)
        execute(['python2.7', 'random_print.py'], timeout_sec = 3)
        execute(['ls', '-l'], cwd = '/tmp')
    
    :param commands: list of strings passed to subprocess.Popen()
    :param sysout_handlers: list of handlers for standard output stream 
    :param syserr_handlers: list of handlers for standard error stream 
    :param shell: boolean, default False, passed to subprocess.Popen()
    :param cwd: string, default None, passed to subprocess.Popen()
    :param timeout_sec: int, default None, number of seconds after which the process is terminated. Cannot be less than 1 second
    :return: process returncode or None when the timeout is reached. Always check this value
    """
    
    def make_async(fd):
        # https://stackoverflow.com/a/7730201/190597
        '''add the O_NONBLOCK flag to a file descriptor'''
        fcntl.fcntl(fd, fcntl.F_SETFL, fcntl.fcntl(fd, fcntl.F_GETFL) | os.O_NONBLOCK)
    
    def read_async(fd):
        # https://stackoverflow.com/a/7730201/190597
        '''read some data from a file descriptor, ignoring EAGAIN errors'''
        try:
            return fd.read()
        except IOError as e:
            if e.errno != errno.EAGAIN:
                raise e
            else:
                return ''
    
    def handle_line(line, handlers):
        if isinstance(handlers, list):
            for handler in handlers:
                handler(line)
        else:
            handlers(line)

    def handle_output(fds, handler_map):
        for fd in fds:
            line = read_async(fd)
            handlers = handler_map[fd.fileno()]
            handle_line(line, handlers)

    process = subprocess.Popen(commands, stdout=subprocess.PIPE, stderr=subprocess.PIPE, close_fds=True, shell=shell, cwd=cwd)
    make_async(process.stdout)
    make_async(process.stderr)
    handler_map = { process.stdout.fileno() : sysout_handlers, process.stderr.fileno() : syserr_handlers }
    
    started = datetime.datetime.now()
    while True:
        rlist, wlist, xlist = select.select([process.stdout, process.stderr], [], [], 1) # 1 second wait for any output
        handle_output(rlist, handler_map)
        if process.poll() is not None:
            handle_output([process.stdout, process.stderr], handler_map)
            break
        if timeout_sec is not None and (datetime.datetime.now() - started).total_seconds() > timeout_sec:
            sys.stderr.write('Terminating subprocess %s on timeout %d seconds\n' % (commands, timeout_sec))
            process.kill()
            break

    return process.returncode

With the default handlers printing the subprocess outputs, usage is pretty simple...


returncode = execute(['python2.7', '/somewhere/random_print.py'])
print(returncode)

In case you really want to accumulate the outputs because you have some plans for them, like parsing them or passing them as input into some other process...


stdoutlist = []
stderrlist = []
returncode = execute(['python2.7', '/somewhere/random_print.py'], [stdoutlist.append], [stderrlist.append])
print(returncode)
print(stdoutlist)
print(stderrlist)

Or you can go ham and use multiple handlers for output lines


import StringIO
stdoutio = StringIO.StringIO()
stderrio = StringIO.StringIO()
stdout_lines = []
stderr_lines = []
stdout_handlers = [sys.stdout.write, stdout_lines.append, stdoutio.write]
stderr_handlers = [sys.stderr.write, stderr_lines.append, stderrio.write]
returncode = execute(['python2.7', '/somewhere/random_print.py'], stdout_handlers, stderr_handlers)
print(returncode)
print(stdout_lines)
print(stdoutio.getvalue())
print(stderr_lines)
print(stderrio.getvalue())
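
The timeout_sec parameter from the docstring deserves an example too. When the limit is hit, the subprocess is killed and execute() returns None instead of a return code, so a sketch of the check could look like this...

# random_print.py runs for about 5 seconds, so a 3 second limit kills it;
# execute() then returns None, which is why the return code must always be checked.
returncode = execute(['python2.7', '/somewhere/random_print.py'], timeout_sec=3)
if returncode is None:
    print('Subprocess was terminated on timeout')
elif returncode != 0:
    print('Subprocess failed with return code %s' % returncode)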

To be complete, here goes my random_print.py I've been testing with...


import sys
import time
import random

for i in range(50):
    if random.choice([True, False]):
        sys.stdout.write('random out-'+ str(i)+'\n')
        sys.stdout.flush()
    else:
        sys.stderr.write('random err-'+ str(i)+'\n')
        sys.stderr.flush()
    time.sleep(0.1)
    
raise Exception('You shall not pass')
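
Since the script deliberately fails at the end, it also makes a handy check that errors are reported properly - the traceback arrives through the stderr handlers and execute() returns a non-zero code. A quick sketch, reusing the accumulating handlers from above...

# The uncaught exception makes random_print.py exit with a non-zero code and
# its traceback text is delivered to the stderr handlers as it is produced.
stderr_chunks = []
returncode = execute(['python2.7', '/somewhere/random_print.py'], syserr_handlers=[stderr_chunks.append])
print(returncode)              # non-zero because of the exception
print(''.join(stderr_chunks))  # ends with: Exception: You shall not pass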

Monday 16 October 2017

Scala type alias in Apache Spark

There is one feature in Scala that I see quite rarely - type aliases. While it can easily be misused and bring only confusion into a Scala project, there is a very good use case for it in Apache Spark jobs on RDDs.

Spark job pipelines can be long and complex, which, combined with Scala's type inference, often leads to code that is very hard to understand. Usually you can fight against this by making "check-points", where you assign an interim result into a value just to make it easier for the reader to follow and understand a long stream of transformations chained together.

For example, the long code combining two Maps into another was extracted into a method. You can see that the author took advantage of Scala type inference but recognised the need to inform the reader about the type of the result by creating a long, expressive name for it. The method name pretty much repeats the same type-explaining idea:

val eligibleSegment2EligibleCountries = extractEligibleSegment2EligibleCountries(filteredSegment2PositiveCount, segmentCountry2PositiveCounts)
It might be a good idea to put the explicit type of the value into the code. The reader then knows exactly what the result is and does not have to jump away to examine the method's return type:
val eligibleSegment2EligibleCountries: List[(Int, Set[Int])] = extractEligibleSegment2EligibleCountries(filteredSegment2PositiveCount, segmentCountry2PositiveCounts)
A bit better, but it can be dramatically improved by using type aliases. In this domain, Segment ID and Country ID are Integers, so let's create aliases for them:
type SegmentId = Int
type CountryId = Int
Now the collection type becomes really self-explanatory to the reader, and we can also save some characters on variable and method names:
val es2c: List[(SegmentId, Set[CountryId])] = extract(fs2pcount, sc2pcount)
But there are also other numeric types which have some special meaning in the domain of the job - Counts and Rates:
type Count = Long
type SampleRate = Double
Then very easy-to-understand method signatures and variable types can be used:
def counts2rates(counts: Map[SegmentId, Count], maxLimit: Long): Map[SegmentId, SampleRate]
def train(exploded: RDD[(SegmentId, DmpCookieProfile)]): RDD[Either[SegmentId, (SegmentId, Count)]]

One might consider aliasing the whole Map or List, but I believe it would on the contrary obscure the collection data types rather than help the reader trying to understand the code.

Happy type aliasing

Thursday 16 March 2017

Apache Spark sample and coalesce

It is wasteful to work with a large number of small partitions of data, especially when S3 is used as data storage. IO then becomes an unacceptably large part of the time spent in a task, most annoyingly when Spark is just moving data files from the _temporary location into the final destination after the real work has been completed.

Here is a small tip on how to down-sample a dataset while keeping the resulting partition sizes similar to the original dataset. The key element is to get the original number of partitions, decrease it accordingly and coalesce the RDD with it.

It is useful when you test your Spark job on a single node (or a few) but with the amount of data you expect production cluster nodes to be able to handle.


  import org.apache.hadoop.io.compress.GzipCodec
  import org.apache.spark.SparkContext

  def sample(sparkCtx: SparkContext, inputPath: String, outputPath: String, fraction: Double, seed: Int = 1) = {
    val inputRdd = sparkCtx.textFile(inputPath)
    val outputPartitions = (inputRdd.partitions.length * fraction).toInt
    inputRdd
      .sample(false, fraction, seed)
      .coalesce(outputPartitions, shuffle = false)
      .saveAsTextFile(outputPath, codec = classOf[GzipCodec])
  }

Happy sampling!

Monday 20 June 2016

Maven, SBT, Gradle local repository sharing

I've run into an environment where different projects were using different build tools - Maven, Gradle and SBT. The problem was how to reuse build artifacts between them, and here goes a small summary of my investigation. I will cover only local file repositories and NOT publishing/retrieving to/from remote repositories.

TL;DR: every tool is able to publish into and retrieve from the Maven local repository.

I'm using today's latest versions: Maven 3.3.9, Gradle 2.14, SBT 0.13.11. While Maven dependency management and artifact publishing is pretty stable, SBT and Gradle are still evolving quite a lot, so check your versions carefully.

All build tools (unless you override the default configuration) use your home directory to keep the local repository. It differs on various operating systems; with my user name mvanek it will be:

  • Linux ${USER_HOME} is /home/mvanek
  • Mac OS X ${USER_HOME} is /Users/mvanek
  • Windows 10 %USERPROFILE% is C:\Users\mvanek

Maven 3.3.9

  • mvn package - operates inside the target subdirectory of your project
  • mvn install - also copies the jar into ${USER_HOME}/.m2/repository

SBT 0.13.11

An sbt build operates inside the target subdirectory of your project.

Let's have the following simple build.sbt:
organization := "bt"
name := "bt-sbt"
version := "1.0.0-SNAPSHOT"

scalaVersion := "2.11.7"
sbtVersion := "0.13.11"

resolvers += Resolver.mavenLocal // Also use $HOME/.m2/repository

libraryDependencies += "commons-codec" % "commons-codec" % "1.10" 
libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.6" % "test"

SBT -> Ivy

SBT uses Apache Ivy internally, so executing sbt publish-local (which is the same as sbt publishLocal) will build and store the produced jars into ${USER_HOME}/.ivy2/local/bt/bt-sbt_2.11 subdirectories. Because Scala applications are bound to the Scala version they are compiled with, the Ivy module/Maven artifactId/jar file name will be bt-sbt_2.11 to reflect that.

SBT -> Maven

Since SBT 0.13.0, publishing into the Maven local repo is trivial. Execute sbt publish-m2 (which is the same as sbt publishM2) to build and store the compiled binary, source and javadoc jars into ${USER_HOME}/.m2/repository/bt/bt-sbt_2.11/1.0.0-SNAPSHOT

Note that for the file cache of remote repositories' artifacts, SBT uses the ${USER_HOME}/.ivy2/cache directory.

Gradle 2.14

Gradle by itself does not have a concept of a local repository. gradle build - operates inside the build subdirectory of your project.

Gradle's build behaviour changes with the plugins applied inside build.gradle. Let's assume only the java plugin is used in a simple build.gradle file:
apply plugin: 'java'

group = 'bt'
version = '1.0.0-SNAPSHOT'

task sourceJar(type: Jar) {
    from sourceSets.main.allJava
}

repositories {
    mavenCentral()
}

dependencies {
    compile group: 'commons-codec', name: 'commons-codec', version: '1.10'
    testCompile group: 'junit', name: 'junit', version: '4.12'
}

The name of the directory that contains the Gradle project files is taken as the project name. This is important because the project name will become the Maven artifactId, and it cannot be redefined in build.gradle. You can redefine it using a settings.gradle file with the line rootProject.name = 'bt-gradle'. There are also options to configure it just for individual plugins, but I'd not recommend that because then you need to configure it for every plugin that might be affected (some publish plugin, for example).

Gradle -> Maven

For sharing with Maven, probably the easiest choice is the maven plugin.

Apply the plugin in your build.gradle with apply plugin: 'maven', then execute the build via gradle install and your jar will land in ${USER_HOME}/.m2/repository/bt/bt-gradle/1.0.0-SNAPSHOT/bt-gradle-1.0.0-SNAPSHOT.jar

Another Gradle -> Maven option is the maven-publish plugin, which might be a good choice if you also plan to publish artifacts into a remote repository as well.

Add into build.gradle

apply plugin: 'maven-publish'

publishing {
    publications {
        mavenJava(MavenPublication) {
            from components.java
        }
    }
}
and execute the build using gradle publishToMavenLocal

Gradle -> Ivy

Because SBT uses Ivy instead of Maven, the ivy-publish plugin should be the weapon of choice for this.
apply plugin: 'ivy-publish'

publishing {
    publications {
        ivyJava(IvyPublication) {
            from components.java
        }
    }
    
    repositories {
        ivy {
            url "${System.properties['user.home']}/.ivy2/local"
        } 
    }
}
then execute the build using gradle publishIvyJavaPublicationToIvyRepository and the jar will appear in ${USER_HOME}/.ivy2/local/bt/gradle-built/1.0.0-SNAPSHOT/gradle-built-1.0.0-SNAPSHOT.jar. This is unfortunately wrong, as it is the Maven layout, not Ivy. You can add the layout explicitly:
    repositories {
        ivy {
            layout "ivy"
            url "${System.properties['user.home']}/.ivy2/local"
        } 
    }
which is better and will work for a simple jar build, but it will fail when you attach sources. Yet another reason why I'll never use Gradle unless I'm tortured for at least a week.

Note that for the file cache of remote repositories' artifacts, Gradle uses the ${USER_HOME}/.gradle/caches/modules-2/files-2.1 directory.

Happy local jar sharing!

Tuesday 22 March 2016

Fun with Spring Boot auto-configuration

I favour configuration that is as straightforward as possible. Unfortunately Spring Boot is pretty much the opposite, as it employs lots of auto-configuration. Sometimes it is way too eager and initializes everything it stumbles upon.

If you are building a fresh new project, you might be spared, but when converting a pre-existing project or building something slightly unusual, you will very likely meet Spring Boot initialization failures.

There are two main groups of initialization sources

  • Libraries on the classpath - which might be dragged in as a transitive dependency of a library you really use.
  • Beans in the application context - Spring Boot auto-configuration often supports only a single resource of some kind, a database DataSource or JMS ConnectionFactory for example. When you have more than one, the initializer gets confused and fails.
  • Combination of both - otherwise it would be too easy

The complete list of auto-configuration classes is in the documentation. For the record, I'm using Spring Boot 1.3.3 and Spring Framework 4.2.5.

Having a simple Spring Boot application like this...

@SpringBootApplication
public class AutoConfigBoom {

    @Bean
    @ConfigurationProperties(prefix = "datasource.ds1")
    DataSource ds1() {
        return DataSourceBuilder.create().build();
    }

    public static void main(String[] args) {
        SpringApplication.run(AutoConfigBoom.class, args);
    }
}
... and application.properties...
datasource.ds1.driverClassName=org.mariadb.jdbc.Driver
datasource.ds1.url=jdbc:mysql://localhost:3306/whatever?autoReconnect=true
datasource.ds1.username=hello
datasource.ds1.password=dolly
...if you happen to have a JPA implementation like Hibernate on the classpath, the JPA engine will be initialized automatically. You might spot messages in the logging output...
2016-03-22 18:45:55,560|main      |INFO |o.s.o.j.LocalContainerEntityManagerFactoryBean: Building JPA container EntityManagerFactory for persistence unit 'default'
2016-03-22 18:45:55,577|main      |INFO |o.hibernate.jpa.internal.util.LogHelper: HHH000204: Processing PersistenceUnitInfo [
 name: default
 ...]
2016-03-22 18:45:55,639|main      |INFO |org.hibernate.Version: HHH000412: Hibernate Core {5.1.0.Final}
2016-03-22 18:45:55,640|main      |INFO |org.hibernate.cfg.Environment: HHH000206: hibernate.properties not found
2016-03-22 18:45:55,641|main      |INFO |org.hibernate.cfg.Environment: HHH000021: Bytecode provider name : javassist
2016-03-22 18:45:55,678|main      |INFO |org.hibernate.annotations.common.Version: HCANN000001: Hibernate Commons Annotations {5.0.1.Final}
2016-03-22 18:45:55,686|main      |WARN |org.hibernate.orm.deprecation: HHH90000006: Attempted to specify unsupported NamingStrategy via setting [hibernate.ejb.naming_strategy]; NamingStrategy has been removed in favor of the split ImplicitNamingStrategy and PhysicalNamingStrategy; use [hibernate.implicit_naming_strategy] or [hibernate.physical_naming_strategy], respectively, instead.
2016-03-22 18:45:56,049|main      |INFO |org.hibernate.dialect.Dialect: HHH000400: Using dialect: org.hibernate.dialect.MySQL5Dialect
If this is undesired, as it only slows down application start-up, exclude the particular initializers...
@SpringBootApplication(
  exclude = { HibernateJpaAutoConfiguration.class, JpaRepositoriesAutoConfiguration.class })
public class AutoConfigBoom {
  ...
}
...which is just a shortcut for...
@Configuration
@EnableAutoConfiguration(
  exclude = { HibernateJpaAutoConfiguration.class, JpaRepositoriesAutoConfiguration.class })
@ComponentScan
public class AutoConfigBoom {
  ...
}

To make things more spicy, let's declare a second DataSource. Let's assume that XA transactions spanning multiple transactional resources are not required.

@SpringBootApplication
public class AutoConfigBoom {

    @Bean
    @ConfigurationProperties(prefix = "datasource.ds1")
    DataSource ds1() {
        return DataSourceBuilder.create().build();
    }

    @Bean
    @ConfigurationProperties(prefix = "datasource.ds2")
    DataSource ds2() {
        return DataSourceBuilder.create().build();
    }

    public static void main(String[] args) {
        SpringApplication.run(AutoConfigBoom.class, args);
    }
}
If you have Hibernate JPA on the classpath you will get...
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'org.springframework.boot.autoconfigure.orm.jpa.HibernateJpaAutoConfiguration': Injection of autowired dependencies failed; 
nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire field: private javax.sql.DataSource org.springframework.boot.autoconfigure.orm.jpa.JpaBaseConfiguration.dataSource; 
nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'ds2' defined in com.example.boom.AutoConfigBoom: Initialization of bean failed; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'dataSourceInitializer': Invocation of init method failed; 
nested exception is org.springframework.beans.factory.NoUniqueBeanDefinitionException: No qualifying bean of type [javax.sql.DataSource] is defined: expected single matching bean but found 2: ds2,ds1
...because the JPA engine is now confused about which DataSource to use. If you really intend to use a JPA EntityManager, then the easiest solution is to mark one DataSource with the @Primary annotation...

    @Primary
    @Bean
    @ConfigurationProperties(prefix = "datasource.ds1")
    DataSource ds1() {
        return DataSourceBuilder.create().build();
    }

...otherwise exclude HibernateJpaAutoConfiguration.class and JpaRepositoriesAutoConfiguration.class from auto-configuration as shown above. The @Primary annotation will also help you in case of...
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'ds1' defined in com.example.boom.AutoConfigBoom: Initialization of bean failed; 
nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'dataSourceInitializer': Invocation of init method failed; 
nested exception is org.springframework.beans.factory.NoUniqueBeanDefinitionException: No qualifying bean of type [javax.sql.DataSource] is defined: expected single matching bean but found 2: ds2,ds1
Now, DataSourceInitializer cannot be simply excluded, but it can be turned off by adding...
spring.datasource.initialize=false
into application.properties, but you will get yet another initialization failure...
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration$JdbcTemplateConfiguration': Injection of autowired dependencies failed; 
nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire field: private javax.sql.DataSource org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration$JdbcTemplateConfiguration.dataSource; 
nested exception is org.springframework.beans.factory.NoUniqueBeanDefinitionException: No qualifying bean of type [javax.sql.DataSource] is defined: expected single matching bean but found 2: ds1,ds2

...which will force you to exclude DataSourceAutoConfiguration and also DataSourceTransactionManagerAutoConfiguration. Probably not worth it. Rather use @Primary to avoid this madness.

One small tip at last. We had a few legacy apps which were good candidates for Spring Bootification. Here goes a typical Spring Boot wrapper I wrote to make the upgrade/transition as seamless as possible. It reuses the legacy external properties file, which is now well covered in the documentation. Part of the upgrade was also a logging migration to slf4j and logback (SLF4JBridgeHandler).

public final class LegacyAppWrapper {

    static {
        SLF4JBridgeHandler.removeHandlersForRootLogger();
        SLF4JBridgeHandler.install();
    }

    public static void main(String[] args) {
        // Do not do initialization tricks
        System.setProperty(
          "spring.datasource.initialize", "false");

        // Stop script does: curl -X POST http://localhost:8080/shutdown
        System.setProperty(
          "endpoints.shutdown.enabled", "true");

        // Instruct Spring Boot to use our legacy external property file
        String configFile = "file:" + System.getProperty("legacy.config.dir") + "/" + "legacy-config.properties";
        System.setProperty(
          "spring.config.location", configFile);

        ConfigurableApplicationContext context = SpringApplication.run(LegacyAppSpringBootConfig.class, args); // embedded http server blocks until shutdown
    }
}

Happy auto-configuration exclusions

Friday 18 March 2016

Join unrelated entities in JPA

With SQL, you can join pretty much any two tables on almost any columns that have a compatible type. This is not possible in JPA, as it relies heavily on relation mapping.

A JPA relation between two entities must be declared, usually using @OneToMany, @ManyToOne or another annotation on some JPA entity's field. But sometimes you cannot simply introduce a new relation into your existing domain model, as it can also bring all sorts of trouble, like unnecessary fetches or lazy loads. Then ad hoc joining comes in very handy.

JPA 2.1 introduced ON clause support, which is useful but not ground-breaking. It only allows adding an extra joining condition to the one that implicitly exists because of the entity relation mapping.

To cut a long story short, the JPA specification still does not allow ad hoc joins on unrelated entities, but fortunately both of the two most used JPA implementations can do it now.

Of course, you still have to map the columns (most likely numeric id columns) using the @Id or @Column annotation, but you do not have to declare a relation between the entities to be able to join them.

EclipseLink since version 2.4 - User Guide and Ticket

Hibernate starting with version 5.1.0 released this February - Announcement and Ticket

Using this powerful tool, we can achieve the unimaginable.

@Entity
@Table(name = "CAT")
class Cat {

    @Id
    @Column(name = "NAME")
    private String name;

    @Column(name = "KITTENS")
    private Integer kittens;
}

@Entity
@Table(name = "INSURANCE")
class Insurance {

    @Id
    @Column(name = "NUMBER")
    private Integer number;

    @Temporal(TemporalType.DATE)
    private Date startDate;
}

For example, to join the number of kittens your cat has with your insurance number!
String jpaql="SELECT Cat c JOIN Insurance i ON c.kittens = i.number";
entityManager.createQuery(jpaql);

If you try this with a Hibernate version older than 5.1.0, you will get a QuerySyntaxException:
Caused by: org.hibernate.hql.internal.ast.QuerySyntaxException: Path expected for join!

Happy ad hoc joins!

Monday 16 November 2015

Upgrading Querydsl 3 to 4

Querydsl has been my weapon of choice for some time now when it comes to writing typesafe JPA queries. The main advantage is the great readability of the code compared to the standard JPA Criteria API. Sometimes I also use the Querydsl SQL module when JPA is not available or the JPA entity mapping is too poor (sadly a common case) to be used in complex queries.

Anyway, today I've been upgrading from Querydsl 3.6.x to the latest 4.0.x and I've met numerous API changes. Unfortunately I did not find any migration instructions, so here come my notes. They might be useful to someone.

Simple query

Before:
QEntity qEntity = QEntity.entity;
List<Entity> list = new JPAQuery(emDomain).from(qEntity).list(qEntity);
After:
QEntity qEntity = QEntity.entity;
List<Entity> list = new JPAQuery<Entity>(emDomain).from(qEntity).fetch();

SearchResults -> QueryResults

Before:
SearchResults<Tuple> results = query.listResults(qEntity.id, qEntity.name);
After:
QueryResults<Tuple> results = query.select(qEntity.id, qEntity.name).fetchResults();

Join fetch

Before:
query.leftJoin(qEntity.approvedCreatives).fetch();
After:
query.leftJoin(qEntity.approvedCreatives).fetchJoin();

Single result

Before:
Tuple tuple = query.singleResult(qEntity.id, qEntity.name);
After:
Tuple tuple = query.select(qEntity.id, qEntity.name).fetchOne();

Subqueries

Before:
JPAQuery query = new JPAQuery(emDomain);
ListSubQuery<Long> subQuery = new JPASubQuery().from(qCampaing).where(qCampaing.id.eq(campaingId)).join(qCampaing.segments, qSegment).list(qSegment.id);
query.from(qSegment).where(qSegment.id.in(subQuery));
After:
JPAQuery<Segment> query = new JPAQuery(emDomain);
JPQLQuery<Long> subQuery = JPAExpressions.selectFrom(qCampaing).where(qCampaing.id.eq(campaingId)).join(qCampaing.segments, qSegment).select(qSegment.id);
query.from(qSegment).where(qSegment.id.in(subQuery));

Happy typesafe Querydsling!