
final case class Discount(k: Int, minimizers: MinimizerSource = Bundled, m: Int = 10, ordering: MinimizerOrdering = Frequency, sample: Double = 0.01, maxSequenceLength: Int = 1000000, normalize: Boolean = false, method: CountMethod = Auto, partitions: Int = 200)(implicit spark: SparkSession) extends Product with Serializable

Main API entry point for Discount. See also the command-line examples in the documentation for more information on these options. A brief usage sketch follows the parameter descriptions below.

k

k-mer length

minimizers

source of minimizers. See MinimizerSource

m

minimizer width

ordering

minimizer ordering. See MinimizerOrdering

sample

sample fraction for frequency orderings

maxSequenceLength

max length of a single sequence (for short reads)

normalize

whether to normalize k-mer orientation during counting. Causes every sequence to be scanned in both forward and reverse, after which only forward orientation k-mers are kept.

method

counting method to use (Auto selects a suitable method automatically). See CountMethod

partitions

number of shuffle partitions/index buckets

spark

the SparkSession
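
A minimal usage sketch (not part of the generated Scaladoc; the application name and file paths are hypothetical, and the Discount and Index classes are assumed to be imported):

  import org.apache.spark.sql.SparkSession

  // Hypothetical Spark session; the Discount constructor takes it implicitly.
  implicit val spark: SparkSession =
    SparkSession.builder().appName("discount-example").getOrCreate()

  // k = 31, all other parameters at their defaults (bundled minimizers, m = 10,
  // frequency ordering with a 1% sample, no orientation normalization).
  val discount = new Discount(31)

  // Build a counting k-mer index from one input file. With the default frequency
  // ordering, the input is first sampled to construct a minimizer ordering.
  val index: Index = discount.index("/data/reads.fastq")

  // Related entry points documented under Value Members below:
  val titles = discount.sequenceTitles("/data/reads.fastq") // sequence titles only
  val kmers = discount.kmers("/data/reads.fastq")           // raw k-mers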

Linear Supertypes
Serializable, Serializable, Product, Equals, AnyRef, Any

Instance Constructors

  1. new Discount(k: Int, minimizers: MinimizerSource = Bundled, m: Int = 10, ordering: MinimizerOrdering = Frequency, sample: Double = 0.01, maxSequenceLength: Int = 1000000, normalize: Boolean = false, method: CountMethod = Auto, partitions: Int = 200)(implicit spark: SparkSession)


Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. def emptyIndex(inFiles: String*): Index

    Construct an empty index, using the supplied sequence files to prepare the minimizer ordering. This is useful when a frequency ordering is used and one wants to sample a large number of files in advance. Index.newCompatible or index(compatible: Index, inFiles: String*) can then be used to construct compatible indexes with actual k-mers using the resulting ordering. (A usage sketch appears after the member list below.)

    inFiles

    The input files to sample for frequency orderings

  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  10. def getInputFragments(file: String, addRCReads: Boolean = false): Dataset[InputFragment]

    Single-file version of the multi-file getInputFragments method below.

  11. def getInputFragments(files: Seq[String], addRCReads: Boolean): Dataset[InputFragment]

    Load input fragments (with sequence title and location) according to the settings in this object.

    files

    input files

    addRCReads

    whether to add reverse complements

  12. def getInputSequences(file: String, addRCReads: Boolean = false): Dataset[NTSeq]

    Single-file version of the multi-file getInputSequences method below.

  13. def getInputSequences(files: Seq[String], addRCReads: Boolean): Dataset[NTSeq]

    Load reads/sequences from files according to the settings in this object. (A loading sketch appears after the member list below.)

    files

    input files

    addRCReads

    whether to add reverse complements

  14. def getSplitter(inFiles: Option[Seq[String]], persistHash: Option[String] = None): MinSplitter[_ <: MinimizerPriorities]

    Construct a read splitter for the given input files based on the settings in this object. (A sketch appears after the member list below.)

    inFiles

    Input files (for frequency orderings, which require sampling)

    persistHash

    Location to persist the generated minimizer ordering (for frequency orderings), if any

    returns

    a MinSplitter configured with a minimizer ordering and corresponding MinTable

  15. def index(compatible: Index, inFiles: String*): Index

    Convenience method to construct a compatible counting k-mer index containing all k-mers from the input sequence files.

    compatible

    Compatible index to copy settings, such as an existing minimizer ordering, from

    inFiles

    input files

  16. def index(inFiles: String*): Index

    Convenience method to construct a counting k-mer index containing all k-mers from the input sequence files. If a frequency minimizer ordering is used (which is the default), the input files will be sampled and a new minimizer ordering will be constructed.

    inFiles

    input files

  17. def inputReader(files: String*): Inputs

    Obtain an InputReader configured with settings from this object.

    files

    Files to read. Can be a single file or multiple files. Wildcards can be used. A name of the format @list.txt will be parsed as a list of files.

  18. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  19. val k: Int
  20. def kmers(knownSplitter: Broadcast[AnyMinSplitter], inFiles: String*): Kmers

    Load k-mers from the given files, using the supplied pre-constructed (broadcast) splitter.

  21. def kmers(inFiles: String*): Kmers

    Load k-mers from the given files.

  22. val m: Int
  23. val maxSequenceLength: Int
  24. val method: CountMethod
  25. val minimizers: MinimizerSource
  26. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  27. val normalize: Boolean
  28. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  29. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  30. val ordering: MinimizerOrdering
  31. val partitions: Int
  32. val sample: Double
  33. def sequenceTitles(input: String*): Dataset[SeqTitle]

    Load sequence titles only from the given input files.

  34. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  35. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  36. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  37. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
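
For emptyIndex together with index(compatible, inFiles*), a sketch of the multi-sample workflow described above (file paths hypothetical; a Discount instance and implicit SparkSession as in the earlier sketch):

  // Sample several files up front to fix a frequency minimizer ordering,
  // without storing any k-mers.
  val empty: Index = discount.emptyIndex("/data/sample1.fastq", "/data/sample2.fastq")

  // Per-sample indexes built against `empty` reuse its minimizer ordering and are
  // therefore compatible with it and with each other.
  val index1 = discount.index(empty, "/data/sample1.fastq")
  val index2 = discount.index(empty, "/data/sample2.fastq")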
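
For getInputSequences and getInputFragments, a small loading sketch continuing from the same discount object (hypothetical paths; Dataset is org.apache.spark.sql.Dataset, and NTSeq and InputFragment are the types named in the signatures above):

  // All sequences from two files, without adding reverse complements.
  val seqs: Dataset[NTSeq] =
    discount.getInputSequences(Seq("/data/a.fasta", "/data/b.fasta"), addRCReads = false)

  // Fragments retain the sequence title and location; single-file overload shown.
  val frags: Dataset[InputFragment] = discount.getInputFragments("/data/a.fasta")

  println(s"${seqs.count()} sequences, ${frags.count()} fragments")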
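
For getSplitter, a sketch that persists the sampled minimizer ordering to a hypothetical location so that later runs can reuse it:

  val splitter = discount.getSplitter(
    Some(Seq("/data/reads.fastq")),
    persistHash = Some("/data/minimizers"))

  // The splitter could then be broadcast (spark.sparkContext.broadcast) and passed
  // to kmers(knownSplitter, inFiles*) so that the input does not have to be sampled again.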
