final case class Discount(k: Int, minimizers: MinimizerSource = Bundled, m: Int = 10, ordering: MinimizerOrdering = Frequency, sample: Double = 0.01, maxSequenceLength: Int = 1000000, normalize: Boolean = false, method: CountMethod = Auto, partitions: Int = 200)(implicit spark: SparkSession) extends Product with Serializable

Main API entry point for Discount. See also the command-line examples in the documentation for more detail on these options.

k: k-mer length

minimizers: source of minimizers. See MinimizerSource

m: minimizer width

ordering: minimizer ordering. See MinimizerOrdering

sample: sample fraction for frequency orderings

maxSequenceLength: maximum length of a single sequence (intended for short reads)

normalize: whether to normalize k-mer orientation during counting. Causes every sequence to be scanned in both forward and reverse orientation, after which only forward-orientation k-mers are kept.

method: counting method to use (Auto selects automatically). See CountMethod

partitions: number of shuffle partitions/index buckets

spark: the SparkSession
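A minimal construction sketch in Scala (the package import and application name are assumptions based on the Discount project layout, not taken from this page):

```scala
import org.apache.spark.sql.SparkSession
import com.jnpersson.discount.spark._

// The Discount constructor takes an implicit SparkSession
implicit val spark: SparkSession = SparkSession.builder()
  .appName("discount-example")
  .getOrCreate()

// Only k is required; all other parameters have defaults
val discount = Discount(k = 28)
```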

Linear Supertypes
Serializable, Serializable, Product, Equals, AnyRef, Any

Instance Constructors

  1. new Discount(k: Int, minimizers: MinimizerSource = Bundled, m: Int = 10, ordering: MinimizerOrdering = Frequency, sample: Double = 0.01, maxSequenceLength: Int = 1000000, normalize: Boolean = false, method: CountMethod = Auto, partitions: Int = 200)(implicit spark: SparkSession)

    Parameters are as documented for the class above.

Value Members

  1. def emptyIndex(inFiles: String*): Index

    Construct an empty index, using the supplied sequence files to prepare the minimizer ordering. This is useful when a frequency ordering is used and one wants to sample a large number of files in advance. Index.newCompatible or index(compatible: Index, inFiles: String*) can then be used to construct compatible indexes with actual k-mers using the resulting ordering.

    inFiles: the input files to sample for frequency orderings
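    For example, a pre-sampling workflow might look like the following sketch (the file names are hypothetical, and `discount` is assumed to be a configured Discount instance):

    ```scala
    // Sample many files up front to fix the frequency-based minimizer
    // ordering, without yet storing any k-mers
    val empty = discount.emptyIndex("sample1.fasta", "sample2.fasta")

    // Reuse the resulting ordering to build mutually compatible indexes
    val idx1 = discount.index(empty, "sample1.fasta")
    val idx2 = discount.index(empty, "sample2.fasta")
    ```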

  2. def getInputFragments(file: String, addRCReads: Boolean = false): Dataset[InputFragment]

    Single-file version of getInputFragments(files: Seq[String], addRCReads: Boolean).

  3. def getInputFragments(files: Seq[String], addRCReads: Boolean): Dataset[InputFragment]

    Load input fragments (with sequence title and location) according to the settings in this object.

    files: input files

    addRCReads: whether to add reverse complements

  4. def getInputSequences(file: String, addRCReads: Boolean = false): Dataset[NTSeq]

    Single-file version of getInputSequences(files: Seq[String], addRCReads: Boolean).

  5. def getInputSequences(files: Seq[String], addRCReads: Boolean): Dataset[NTSeq]

    Load reads/sequences from files according to the settings in this object.

    files: input files

    addRCReads: whether to add reverse complements

  6. def getSplitter(inFiles: Option[Seq[String]], persistHash: Option[String] = None): MinSplitter[_ <: MinimizerPriorities]

    Construct a read splitter for the given input files based on the settings in this object.

    inFiles: input files (needed for frequency orderings, which require sampling)

    persistHash: location to persist the generated minimizer ordering (for frequency orderings), if any

    returns: a MinSplitter configured with a minimizer ordering and corresponding MinTable

  7. def index(compatible: Index, inFiles: String*): Index

    Convenience method to construct a compatible counting k-mer index containing all k-mers from the input sequence files.

    compatible: compatible index to copy settings from, such as an existing minimizer ordering

    inFiles: input files

  8. def index(inFiles: String*): Index

    Convenience method to construct a counting k-mer index containing all k-mers from the input sequence files. If a frequency minimizer ordering is used (the default), the input files will be sampled and a new minimizer ordering will be constructed.

    inFiles: input files
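    A usage sketch (hypothetical file names; `discount` is assumed to be a configured Discount instance):

    ```scala
    // With the default Frequency ordering, the input files are first
    // sampled to construct a minimizer ordering, after which all
    // k-mers from the inputs are indexed
    val index = discount.index("reads1.fastq", "reads2.fastq")
    ```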

  9. def inputReader(files: String*): Inputs

    Obtain an InputReader configured with settings from this object.

    files: files to read. Can be a single file or multiple files. Wildcards can be used. A name of the format @list.txt will be parsed as a list of files.
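    The accepted file name forms can be illustrated as follows (hypothetical paths; `discount` is assumed to be a configured Discount instance):

    ```scala
    val single   = discount.inputReader("reads.fastq")          // one file
    val many     = discount.inputReader("r1.fastq", "r2.fastq") // several files
    val globbed  = discount.inputReader("data/*.fasta")         // wildcard
    val fromList = discount.inputReader("@list.txt")            // list.txt names one file per line
    ```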

  10. val k: Int
  11. def kmers(knownSplitter: Broadcast[AnyMinSplitter], inFiles: String*): Kmers

    Load k-mers from the given files, using the supplied (already broadcast) splitter.

  12. def kmers(inFiles: String*): Kmers

    Load k-mers from the given files.

  13. val m: Int
  14. val maxSequenceLength: Int
  15. val method: CountMethod
  16. val minimizers: MinimizerSource
  17. val normalize: Boolean
  18. val ordering: MinimizerOrdering
  19. val partitions: Int
  20. val sample: Double
  21. def sequenceTitles(input: String*): Dataset[SeqTitle]

    Load sequence titles only from the given input files.