Skip to content

Conversation

@Shimuuar
Copy link
Contributor

@Shimuuar Shimuuar commented Sep 3, 2025

This is proof of concept implementation of #535 for primitive vectors.

Idea work out nicely. There's one small complication. It makes sense to define unsafe function working on representation of a vector but we get name clashes: both mutable and pure primitive vectors defined unsafeCast so it pprobab;y makes sense to keep such function where they're currently defined

Reviews are very welcome.

Fixes #535, Fixes #524, Fixes #517, Fixes #506, Fixes #492, Fixes #171

@lehins
Copy link
Contributor

lehins commented Sep 3, 2025

both mutable and pure primitive vectors defined unsafeCast

I think it would make sense to add an Data.Primitive.Unsafe and Data.Primitive.Mutable.Unsafe modules to separate the concerns. This is what I was suggesting in #361 which was scrapped. But, considering that this idea is coming up again and again, maybe it is a good idea after all?

@Shimuuar
Copy link
Contributor Author

Shimuuar commented Sep 5, 2025

Yes it seems in the end library authors need to provide access to internals and it's better to do it systematic way. I hoped to get away with single Unsafe module but looks like 2 are needed.

@Shimuuar Shimuuar mentioned this pull request Sep 14, 2025
@Shimuuar Shimuuar force-pushed the unsafe-constructor branch 2 times, most recently from e7237e3 to 32c7585 Compare September 18, 2025 18:44
@Shimuuar
Copy link
Contributor Author

I implemented unsafe modules for everything except Unboxed. It's quite a bit different from other vectors and still require some cleanup. Haddocks are not ready either

General idea is to move definition of data type and all function that work directly on vector representation into Unsafe module. It has nice side effect of separating code specific to vector type from endless specializations.

At the moment all function safe and unsafe are simply reexported from safe modules. Deprecation warning is only applied to pattern synonym.

@Shimuuar
Copy link
Contributor Author

Shimuuar commented Oct 3, 2025

Now PR is mostly done. Implementation follows rules below:

  • For all vector types except unboxed. Two Unsafe modules are added. They provide access to constuctors. All function what work on internal representation are defined there. Exported constructors from safe modules are replaced with deprecated pattern synonyms.
  • For unboxed vectors. Data.Vector.Unboxed.Base renamed to Data.Vector.Unboxed.Unsafe. All constructors for data instances which are safe to expose are exposed (both mutable and immutable). These are newtype wrappers over vector representation. Exports of unsafe constructors

Remaining questions

  1. Right now only constructors are deprecated. Do we want to deprecate unsafe functions as well? (unsafe{From,To}ForeignPtr0, unsafeCoerceMVector, unsafeCast, etc) now they are simply reexported from "safe" modules.

  2. Renaming Data.Vector.Unboxed.BaseData.Vector.Unboxed.Unsafe is breaking change I think. However I want to rework size hints in bundles. And that will be breaking change anyway

  3. New modules are not documented properly yet.

@Shimuuar Shimuuar force-pushed the unsafe-constructor branch from 6e2ceb8 to a44dfb1 Compare October 3, 2025 14:17
They contain publicly accessible definitions of vector data types and unsafe
operations on structure.
We have 3 sorts of data instances:

 1. Constructors that we must exports. Data instances which are used in deriving
    via (UnboxViaPrim etc.). When constructors aren't visible deriving won't
    work. They were exported before.

 2. Constructors which are safe to use. Newtype vectors over underlying
    representation. Now they're exported from both immutable and mutable

 3. Dangerous and unsafe ones: tuple and unit instances. They're exported from
    Unsafe module. Mutable module reexports deprecated pattern synonym.
@Shimuuar Shimuuar marked this pull request as ready for review January 25, 2026 12:23
@Shimuuar
Copy link
Contributor Author

I think PR is ready.

@Shimuuar
Copy link
Contributor Author

@lehins Do you have any comments? If no I'm going to merge this PR next weekend

@lehins
Copy link
Contributor

lehins commented Jan 31, 2026

Sorry, I did want to review this PR.
I was travelling this week, so didn't have time to do this. Give me a few days and I will make sure to review it

@Shimuuar
Copy link
Contributor Author

Shimuuar commented Feb 1, 2026

No problem. I worked on PR for 4 month on and off. It can wait.

On second thought dropping D.V.Unboxed.Base was too risque. It's better to deprecate that module and reexport content of new D.V.Unboxed.Unsafe

Copy link
Contributor

@lehins lehins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from some functions unfairly being tainted as unsafe, I believe this is a very nice start for separating safe vs unsafe interface

Comment on lines +170 to +187
-- Conversions - Arrays
-- -----------------------------

-- | /O(n)/ Make a copy of a mutable array to a new mutable vector.
--
-- @since 0.12.2.0
fromMutableArray :: PrimMonad m => MutableArray (PrimState m) a -> m (MVector (PrimState m) a)
{-# INLINE fromMutableArray #-}
fromMutableArray marr =
let size = sizeofMutableArray marr
in MVector 0 size `liftM` cloneMutableArray marr 0 size

-- | /O(n)/ Make a copy of a mutable vector into a new mutable array.
--
-- @since 0.12.2.0
toMutableArray :: PrimMonad m => MVector (PrimState m) a -> m (MutableArray (PrimState m) a)
{-# INLINE toMutableArray #-}
toMutableArray (MVector offset size marr) = cloneMutableArray marr offset size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand why these two functions are moved to the unsafe module?
They both allocate a new array and make a full copy using safe bounds. I can't see anything unsafe about them.

Comment on lines +53 to +54
type IOVector = MVector RealWorld
type STVector s = MVector s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these two synonyms can go back into the safe module. They are totally safe and aren't needed in this module

Comment on lines +248 to +278
-- Conversions - Lazy vectors
-- -----------------------------

-- | /O(1)/ Convert strict array to lazy array
toLazy :: Vector a -> V.Vector a
toLazy (Vector v) = v

-- | /O(n)/ Convert lazy array to strict array. This function reduces
-- each element of vector to WHNF.
fromLazy :: V.Vector a -> Vector a
fromLazy vec = liftRnf (`seq` ()) v `seq` v where v = Vector vec


-- Conversions - Arrays
-- -----------------------------

-- | /O(n)/ Convert an array to a vector and reduce each element to WHNF.
--
-- @since 0.13.2.0
fromArray :: Array a -> Vector a
{-# INLINE fromArray #-}
fromArray arr = liftRnf (`seq` ()) vec `seq` vec
where
vec = Vector $ V.fromArray arr

-- | /O(n)/ Convert a vector to an array.
--
-- @since 0.13.2.0
toArray :: Vector a -> Array a
{-# INLINE toArray #-}
toArray (Vector v) = V.toArray v
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with these guys. I don't think there is anything unsafe about them at all. They all preserve the necessary invariants. They force the elements when needed and make copies when slicing cannot be preserved.

-- @since 0.13.2.0
toArraySlice :: Vector a -> (Array a, Int, Int)
{-# INLINE toArraySlice #-}
toArraySlice (Vector v) = V.toArraySlice v
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one I can believe, for example, that it can be unsafe, since it is possible that some part of the result array will be uninitialized

Comment on lines +296 to +313
-- Conversions - Arrays
-- -----------------------------

-- | /O(1)/ Convert an array to a vector.
--
-- @since 0.12.2.0
fromArray :: Array a -> Vector a
{-# INLINE fromArray #-}
fromArray arr = Vector 0 (sizeofArray arr) arr

-- | /O(n)/ Convert a vector to an array.
--
-- @since 0.12.2.0
toArray :: Vector a -> Array a
{-# INLINE toArray #-}
toArray (Vector offset len arr)
| offset == 0 && len == sizeofArray arr = arr
| otherwise = cloneArray arr offset len
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. These conversions are totally safe

@Shimuuar
Copy link
Contributor Author

@lehins Yes I moved some safe functions to Unsafe module. I found that when all function which work on vector representation directly and not via Vector/MVector type classes are defined in one place, it improves readability. And there's little harm in exporting safe function from unsafe module. Similarly Unsafe module for unboxed vector has a lot of stuff that's quite safe.

Real question is what should go into Unsafe modules and what should be eventually removed from safe ones?

  • Right now all function that work on vector representation directly are in defined in unsafe module. I think it's right choice
  • Should we deprecate reexport of unsafe functions?
  • Should we move unsafe indexing function (unsafeIndex etc) to unsafe modules and deprecate it as well?
  • If so should we add D.V.Generic.Unsafe as well for unsafe indexing?

What do you think about creating another one directional pattern synonym that would allow continue pattern matching on vectors in a safe manner? Can't think of any good names for it though.

What about PrimVector/PrimMVector etc?

@lehins
Copy link
Contributor

lehins commented Feb 10, 2026

Yes I moved some safe functions to Unsafe module. I found that when all function which work on vector representation directly and not via Vector/MVector type classes are defined in one place, it improves readability. And there's little harm in exporting safe function from unsafe module.

I do not agree with this at all. This will make people wrongfully believe that those functions are unsafe. And I can certainly see harm in it!

Real question is what should go into Unsafe modules and what should be eventually removed from safe ones?

Anything that is unsafe should IMHO be eventually moved: unsafeIndex, unsafeFreeze, etc. It should not be only about access to constructors, it should be about safe vs unsafe functionality.

Right now all function that work on vector representation directly are in defined in unsafe module. I think it's right choice

Totally disagree. There is nothing wrong in providing a safe function that itself uses unsafe constructor.

Should we deprecate reexport of unsafe functions?

Yes, I believe so.

Should we move unsafe indexing function (unsafeIndex etc) to unsafe modules and deprecate it as well?

Yes, I believe so as well

If so should we add D.V.Generic.Unsafe as well for unsafe indexing?

I do believe that would be a good idea for consistency and safety sake

What about PrimVector/PrimMVector etc?

Doesn't sound too bad. Something like this?

  • PrimVector/PrimMVector
  • StorableVector/StorableMVector`
  • BoxedLazyVector/BoxedLazyMVector
  • BoxedStrictVector/BoxedStrictMVector

If these seem too long maybe we could shorten them:

  • PVector/PMVector
  • SVector/SMVector`
  • BLVector/BLMVector
  • BSVector/BSMVector

We could also provide matching type synonyms, so users could just import the type synonyms and use Generic interface if they so choose.

Put module name in DEPRECATE pragma in quotes. This way haddock is able to link
to module.

Use MVector(MVector) in export list when importing from module which exports
pattern instead of constructor. This approach doesn't work when pattern is
defined in module.

Co-authored-by: Alexey Kuleshevich <alexey@kuleshevi.ch>
@Shimuuar
Copy link
Contributor Author

It turns that export of Vector(Vector) does not pick pattern defined in module, it packs imported constructor instead. But when module export synonym exporting Mvector(MVector) works fine. It seems only way to keep import ... MVector(..) working is to define deprecated patterns in separate module.

Also haddock doesn't link properly 'U.Vector' and requires full qualification.

@Shimuuar
Copy link
Contributor Author

I do not agree with this at all. This will make people wrongfully believe that those functions are unsafe. And I can certainly see harm in it!

So we disagree on this but your approach is valid as well and I don't feel to strong about his. Let do it your way. This PR is uncomfortably big already so I propose following plan:

  1. Move safe functions back to safe module and merge this PR. It's good intermediate point
  2. Next PR implements deprecation of unsafe conversion functions
  3. Third PR move unsafe indexing into Unsafe modules and deprecates their use from safe modules

We could also provide matching type synonyms, so users could just import the type synonyms and use Generic interface if they so choose.

BTW exactly this was requested in #240 but I think it's better to do this separately

@lehins
Copy link
Contributor

lehins commented Feb 10, 2026

I 100% agree with this plan: #540 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants